FACT OR OPINION: A SOCIOLINGUISTIC VIEW OF NATIVE-SPEAKER
INTUITIONS AS EVIDENCE IN LINGUISTICS 


Charles Boberg 
McGill University 


This paper addresses the conference theme, the nature of evidence in linguistics, from a
sociolinguistic point of view. In particular, it examines the extent to which native-speaker
intuitions, the standard kind of evidence in much of contemporary theoretical linguistics,
can provide reliable evidence of a speaker’s grammar. It will attempt to show, by means of
examples from a study of sound change in progress in North American English, that what
native speakers tell a linguist about their grammar only sometimes aligns with the facts
about that grammar that might be arrived at through empirical observation.

1. INTRODUCTION: THREE KINDS OF EVIDENCE. There are three kinds of evidence in linguistics.
Given the centrality of the question of what counts as evidence to the nature of scientific
inquiry, we can to some extent propose a typology of evidence as the basis for a
corresponding typology of linguistic science that is somewhat different from the usual one.
Historical linguistics has traditionally made use of written texts and comparative
reconstruction, the only kind of evidence available on language as it was spoken before the
advent of the tape recorder. By contrast, branches of linguistics concerned primarily with
the linguistic present rather than the linguistic past have preferred to study spoken
language. Of these, traditional dialectology and much of contemporary theoretical linguistics
have usually collected data by consulting the intuitions of native speakers about their
grammars, elicited by means of questionnaires, fieldwork and grammaticality judgments, rather
than by observing the unconscious use of language in everyday acts of communication. This
approach of course requires an assumption that speakers have conscious access to their
grammars, and that their intuitions about them are accurate and can be coherently and
intelligibly reported.

The third kind of evidence in linguistics is the empirical observation of speech, in
which the immediate object of study is linguistic production itself, rather than speakers’
reports or opinions about language. The approach of examining what speakers do
rather than what they say they do has been favored by phonetics and sociolinguistics,
for different reasons. In phonetics, it is necessitated by the obscurity of subphonemic
differences in sound. Without phonetic training, an informative discussion of the
subject matter is impossible. Even if speakers can hear the relevant differences in
their own speech (which sometimes they cannot), they possess no vocabulary with
which to comment on them. The sociolinguistic preference for empirical data is more
a matter of choice. Data on social variation in language could be (and sometimes is)
gathered by eliciting native speakers’ intuitions about it, and sociolinguistics could be
primarily about the influence of social factors on speakers’ opinions about language.
The innovators of the field, however, pursued instead an empirical methodology,
because they explicitly rejected the assumption that speakers’ intuitions are an accurate
and reliable source of data on grammar.

2. SOCIOLINGUISTICS AND THE EMPIRICAL OBSERVATION OF LANGUAGE. One of the cornerstones of
the sociolinguistic enterprise has been an insistence upon the empirical observation of
language in its social context (that is, of language used in everyday acts of communication
among members of a speech community) as the primary method for studying the rules and
representations of grammar. This insistence arises from what Labov (1972:209) has termed the
observer’s paradox: that the nature of what we want to observe is distorted by the very act
of observation. Many sociolinguistic studies have shown that speakers shift their production
in the direction of what they perceive to be the standard or most prestigious variety of the
language when they know they are being observed, or when they are compelled to pay attention
to their speech by direct questions about it. The most that direct elicitation of linguistic
data can therefore hope to achieve is an indication of a speaker’s opinion of the correct
way to speak. It cannot usually determine how people speak when they are not being observed
and not paying attention to their speech: the vernacular, in sociolinguistic parlance.
Because the primary purpose of language is communication among members of the speech
community rather than the discussion of language with linguists, vernacular speech is held
by sociolinguists to be the primary object of linguistic study. Moreover, sociolinguistic
studies have shown the vernacular to be the most consistent and systematic level of
linguistic production, free of self-conscious and sporadic attempts at correction, and
therefore the best source of data on a speaker’s grammar (ibid.: 208).

Three principal methods have been devised for overcoming the observer’s paradox
and gaining access to the vernacular. Perhaps the most famous is the rapid and anonymous
survey technique introduced by Labov in his study of the vocalization of /r/ in
New York City department stores (Labov 1966). This approach attacked the problem
of observation simply and directly by collecting data in a context in which speakers
were not aware that they were being observed. By giving up background
information on speakers beyond what could be visually observed, and by limiting the
object of study to a single linguistic variable, Labov was able to disguise his
investigation as an inquiry about the location of merchandise in a store (items known to be
on the fourth floor). This allowed him to observe the unconscious and unreflecting use of
language by a wide sample of members of the speech community for ordinary purposes
connected with their daily activities. Even in this study, a small shift in the direction of
the standard (/r/-insertion) was found to occur when store employees were asked to
repeat their response.

The second and most common method of eliciting data on the vernacular is the 
sociolinguistic interview, a lengthy interaction between linguist and informant in
which a wide range of speech styles is elicited, as in the urban surveys carried out by
Labov in New York City (1966) and Trudgill in Norwich (1974). Elicitation techniques
range from the discussion of minimal pairs and the reading of word lists, which elicit 
very formal, self-conscious speech, to encouraging the subject to tell stories about 
emotionally charged personal experiences. These narratives elicit a style that is as 
close to the vernacular as is possible in the presence of a linguist and a tape recorder, 
because the subject is so concerned with the task of communicating the emotional 
content of the story that very little attention can be paid to linguistic form. Rather 
than studying only vernacular speech, then, sociolinguistic interviews control speech 
style as an independent variable, to be considered along with purely linguistic factors 
and the external, social characteristics of the speaker in the study of language. 

A third method of overcoming the observer’s paradox is less commonly practiced, 
because it requires a much greater investment of time. This is the ethnographic tradition
of participant-observation, borrowed from anthropology, in which the linguist,
perhaps over the course of a year or more, becomes sufficiently involved in the speech 
community under study that the presence of an observer is no longer remarkable to 
its members. The observer may, in fact, play a role in the community other than that of 
observer; this role may even be a condition of the possibility of discreet observation. 
In these circumstances, a long-term study can be made of daily linguistic interactions 
that are presumably free of self-monitoring on the part of speakers. Examples of this 
approach are Cheshire’s study of playgrounds in Reading (1978) and Eckert’s study of 
a high school in Livonia, near Detroit (1988). 

3. PRACTICAL LIMITATIONS ON EMPIRICAL OBSERVATION. As valuable as empirical
observation may be, it is not always practical. It has already been pointed out that 
rapid and anonymous surveys cannot gather background information on speakers 
beyond what can be observed visually (sex, approximate age, occupational status 
and physical context). In a study where residential history is essential to establish, 
for example, a rapid and anonymous survey would not be feasible. Moreover, rapid 
and anonymous studies impose severe restrictions on the type of variable that can be 
examined. It is relatively easy to elicit an interaction that will feature a phonological 
variable, but much harder to ensure that a morphological or syntactic variable will be 
part of the elicited response. Further restrictions are imposed by ethical objections to 
surreptitious tape recording: rapid and anonymous data must be transcribed on the 
scene, by hand, immediately after they are gathered. This requirement means that 
the study has to focus on only one or two variables, since a larger range of observations
would be impossible to remember and transcribe accurately. Finally, rapid and
anonymous surveys and participant-observer studies require social and linguistic 
access to the speech community, which is not always available when linguists work on 
languages other than their own. 

Sociolinguistic interviews can overcome many of the limitations of rapid and 
anonymous surveys. Indeed, Labov (1972:60-62) has observed that these two methods
have complementary strengths and weaknesses, so that convergence of the results
obtained by each method provides strong confirmation of the accuracy of the data.
However, sociolinguistic interviews and the analysis of long samples of recorded 
speech require considerable time and resources on the part of both linguists and 
subjects, which are sometimes in short supply or difficult to obtain. Conducting 
interviews may in fact be impossible, if the linguist is studying a language spoken in a 
distant place but cannot travel there. Even when it is possible, there are limitations on 
what sort of data can be effectively studied in natural speech. For instance, whereas 
most phonetic variables occur frequently in any quantity of speech, the development 
of syntactic theory sometimes depends on the grammaticality of certain crucial 
constructions that occur so rarely in natural speech that collecting and analyzing 
vernacular data would be overly time-consuming. In these cases, directly eliciting 
grammaticality judgments may seem the only reasonable way to proceed. 

Even the study of phonetic and phonological variation can sometimes benefit 
from the use of word lists and reading passages to ensure an adequate quantity of 
data on the realization of variables in less frequent allophonic environments, or to 
eliminate confounding variables like stress and phonological context in the comparison
of such environments. The study of phonological mergers and splits, either as part
of the synchronic description of the phonemic inventory of a language, or as part of
a study of sound change in progress, necessarily involves the elicitation of
native-speaker judgments, since Labov (1994:293-418) has shown that speakers’ perception
of phonemic contrasts is sometimes quite different from their production of the same 
contrasts, and that the status of both must be established if a clear understanding of 
the nature of phonemic systems and of the mechanisms of phonemic change is to be 
arrived at. Minimal pairs that feature the phonological variable in identical contexts 
are therefore the phonologist’s equivalent of the grammaticality tests of syntacticians: 
speakers are asked whether two words rhyme, or sound the same or different, in order 
to establish the phonemic status of the single sound by which they might differ. 

When any one of the above conditions applies, even sociolinguists may be tempted 
to rely on native-speaker intuitions, either explicitly articulated in grammaticality 
judgments and minimal pair tests or performed in the reading of word lists or passages,
as evidence of the linguistic facts they seek to uncover. The remainder of this
paper will present data that illuminate the conditions under which the cautious use 
of intuitions as evidence in linguistics is more or less reliable. The focus will be on 
the case of establishing the presence or absence of phonemic contrast, since this usually
involves both the elicitation of judgments from the speaker and an independent
assessment of contrast by the analyst, therefore providing the possibility of a direct 
comparison between the two approaches. 

4. THE STUDY OF PHONEMIC CONTRAST: DATA FROM THE TELSUR PROJECT. A crucial issue in the
description of any language is the establishment of its phonemic inventory, the set of
contrasting sounds that underlies its ability to relate meaning to sound in a linguistic
signaling system. It is well known that phonemic inventories are neither diachronically
stable nor uniform across dialects. Phonemic contrasts can arise or disappear over time,
and a contrast made in one dialect can be absent in another. Labov has shown that one of
the most fundamental and consequential ways in which dialects of North American English
differ is in the number and nature of phonemic contrasts they maintain in their vowel
systems (Labov 1991). Some dialects maintain a contrast between two historical developments
of Middle English short-a, so that past and pat have different vowels, while most North
Americans have the same vowel in these words. A more general divide in North American
English is between dialects that distinguish /a/ and /ɔ:/, the vowels of cot and caught,
and dialects in which they are merged as a single phoneme. Beyond these general,
unconditioned mergers, there are conditioned neutralizations of contrast that occur in
specific phonetic environments in certain dialects. These include the merger of /i:/ and
/ɪ/, /e:/ and /ɛ/, and /u:/ and /ʊ/ before /l/ (peel/pill, sail/sell, fool/full); of /e/
and /æ/ and /ɔ/ and /o:/ before /r/ (merry/marry; for/four); and of /ɪ/ and /ɛ/ before
nasals (pin/pen; him/hem).

These variables of phonemic inventory, together with the systematic shifts in the
phonetic realization of vowels to which they are structurally related, are the primary
focus of an extensive survey of regional variation and change in North American English
now underway at the Linguistics Laboratory of the University of Pennsylvania,
known as the Telsur Project, directed by William Labov. During the 1990s, Telsur
used a combination of sociolinguistic interviews conducted over the telephone and
computerized acoustic analysis to assemble a set of phonetic and phonological data
on North American English that covers the entire continent, an achievement that had
so far eluded the field survey tradition of American dialectology. Approximately 800
speakers were interviewed, and over half of these interviews were subjected to acoustic
analysis. The results are now being compiled in an Atlas of North American English:
Phonetics, phonology and sound change (Labov, Ash & Boberg, in press), which will
contain maps representing the geographic distribution of the variants of each variable,
together with analyses of the sound changes now in progress in North American
English, and a taxonomy of North American English dialects based on acoustic phonetic
data.

Telsur relies on three complementary sets of data in its assessment of the phonemic
contrasts present in each dialect. One is an acoustic analysis of the phonetic
realization of each potential phoneme in spontaneous speech. The second is both
acoustic and auditory analysis of minimal pairs in which the speaker is asked to say
two words that differ only with respect to the contrast in question, thereby eliminating
all possible contextual influences on phonetic production. The third set of data
comes from speakers’ judgments of the contrast involved in each elicited pair: do the
words sound the same or different; do they rhyme or not? The most obvious comparison
to be made, then, is between the data representing the analyst’s observations of a
particular phonemic contrast (the speaker’s production) and the speaker’s judgments
of the same contrast (the speaker’s perception).

5. ANALYST OBSERVATIONS VS. SUBJECT INTUITIONS IN MINIMAL PAIRS. Table 1 shows data
obtained from interviews with over 700 subjects across North America, focusing on three
contrasts in four minimal pairs. The first contrast is between /a/ and /ɔ:/, first before
/t/ in cot vs. caught, and then before /n/ in Don vs. dawn. The second contrast is between
/ɪ/ and /ɛ/ before /n/, in pin vs. pen. The third is between /u/ and /u:/ in full vs. fool.
Both the subject’s and the analyst’s judgments of the pairs (i.e., perception and production
of contrast) are classified with three terms: same, meaning no difference in sound; close,
meaning a marginal, uncertain or inconsistent difference; and different, meaning a clear
and consistent difference. This gives us nine possible combinations of subject and analyst
judgments, each of which is listed in the table. These can be grouped for purposes of
analysis into cases of agreement or disagreement between subject and analyst. The cases of
disagreement can be further divided between cases where the analyst judges the merger to be
more advanced than the subject does (production ahead of perception), and cases where the
analyst judges the merger to be less advanced than the subject does (perception ahead of
production).

Perception          Production
(subject judgment)  (analyst judgment)  cot/caught   Don/dawn     pin/pen      full/fool
same                same                217   30%    292   41%    181   25%     53    7%
close               same                 13    2%     18    3%      6    1%      4    1%
different           same                 12    2%      6    1%      2    0%      2    0%
same                close                57    8%     25    4%     30    4%     20    3%
close               close                23    3%     39    6%     37    5%     30    4%
different           close                22    3%     22    3%     35    5%     24    3%
same                different            29    4%     21    3%     13    2%     14    2%
close               different            40    5%     31    4%     39    5%     38    5%
different           different           316   43%    253   36%    367   52%    528   74%
TOTAL                                   729  100%    707  100%    710  100%    713  100%
Perception ahead                        126   17%     77   11%     82   12%     72   10%
Production ahead                         47    6%     46    7%     43    6%     30    4%
Total disagreements                     173   24%    123   17%    125   18%    102   14%

Table 1. Perception vs. production data for four minimal pairs, from sociolinguistic
interviews with native speakers of North American English, tape recorded for the Atlas
of North American English (Labov, Ash & Boberg, in press).
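
The grouping of the nine judgment combinations can be made explicit with a short
computational sketch. It is offered only as an illustration and is not part of the Telsur
procedure: the names MERGER_RANK, classify and tally are hypothetical, and the only data
used are the cot/caught counts reported in Table 1. Each judgment is ranked by how advanced
a merger it implies, and a pair of judgments is labeled according to which side reports
the more advanced merger.

```python
from collections import Counter

# Hypothetical ordinal scale: a higher rank means a more advanced merger.
MERGER_RANK = {"different": 0, "close": 1, "same": 2}


def classify(perception: str, production: str) -> str:
    """Label one (subject judgment, analyst judgment) combination."""
    p, q = MERGER_RANK[perception], MERGER_RANK[production]
    if p == q:
        return "agreement"
    # The subject reporting a more advanced merger than the analyst hears
    # is what the text calls perception ahead of production.
    return "perception ahead" if p > q else "production ahead"


# cot/caught counts from Table 1, keyed by (perception, production).
COT_CAUGHT = {
    ("same", "same"): 217,
    ("close", "same"): 13,
    ("different", "same"): 12,
    ("same", "close"): 57,
    ("close", "close"): 23,
    ("different", "close"): 22,
    ("same", "different"): 29,
    ("close", "different"): 40,
    ("different", "different"): 316,
}


def tally(cells: dict) -> Counter:
    """Sum the cell counts within each of the three groups."""
    totals = Counter()
    for (perception, production), n in cells.items():
        totals[classify(perception, production)] += n
    return totals


totals = tally(COT_CAUGHT)
print(dict(totals))
```

Under these assumptions the tally reproduces the summary rows of Table 1 for cot/caught:
126 cases of perception ahead, 47 of production ahead, and 556 agreements out of 729.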

The first point to be made about the data in Table 1 is that in the majority of cases, 
the subject and analyst do agree on the status of the phonemic contrast. However, the 
last line of the table shows that in a significant minority of cases, ranging from 14 per 
cent for the pair full-fool to 24 per cent for cot-caught, they do not agree. The author’s 
personal experience in evaluating thousands of minimal pair tests for the Telsur
Project suggests that many of the disagreements arise from perfectly trivial, non-linguistic
factors like distraction, boredom, fatigue, insincerity, confusion or simple
error on the part of subjects. These factors nevertheless play a part in determining the
confidence with which we rely on native speakers’ intuitions about their grammar as
evidence in linguistics. Another important factor, contributing to the large number of
close judgments in the perception column, is subjects’ failure or refusal to state their 
intuitions in appropriately precise terms. When asked if two words sound the same 
or not, many subjects answer something like, ‘they’re similar’, ‘they’re pretty close’, or 
‘yeah, they’re pretty much the same’, the exact meaning of which is often difficult to 
interpret. 

A further set of factors that interfere with the accuracy of subjects’ responses arises
from the observer’s paradox referred to above. Sometimes one or both of the sounds
involved in a minimal pair is the target of negative social evaluation, causing speakers’
judgment of contrast to reflect their perceptions of the evaluative norms of the speech
community rather than their own usage. Another effect of linguistic insecurity is that
speakers who are self-conscious about their level of education may claim that there
is a difference in sound between two words simply to demonstrate their knowledge
that the words are spelled differently. Many subjects begin their response to a minimal
pair question with, ‘well, they’re not spelled the same...’ Moreover, to a self-conscious
subject, it may seem safer to claim an ability to discriminate a small difference
between two things even if it may not be there, than to admit an inability to discriminate
a difference that is potentially present.

6. THE NATURE OF ANALYST-SUBJECT DISAGREEMENTS ABOUT PHONEMIC CONTRAST. If the factors just
enumerated were the only ones operating to produce subject-analyst disagreements about
phonemic contrast, we would expect to find more or less
random patterns of disagreement, and more or less similar rates of disagreement in 
each region. A closer analysis of the data from Table 1 shows that this expectation is 
not supported, and that a better understanding of the nature of asymmetries between 
perception and production can help us to assess the risks involved in relying on 
native-speaker intuitions in different situations. 

A comparison of the frequency of disagreements in which perception is ahead of
production to the frequency of the opposite type of disagreement, production leading
perception, suggests that the relationship between these frequencies is not random.
In all four cases, involving three different mergers that affect different parts of North
America, it is more common for perception to be ahead of production than the opposite.
The bias in favor of perception leading production ranges from just over 60% of
disagreements in the case of Don and dawn (77 of 123) to a ratio of almost three to one
in the case of cot and caught (126 against 47). This state of affairs is the opposite of
the prediction that would follow from the above hypothesis that subjects feel safer
claiming to hear doubtful distinctions than admitting that they cannot hear a difference
between two words. In fact, subjects are on average twice as likely to fail to report a
difference in production noted by the analyst as to claim that they make a distinction
that the analyst cannot hear. In other words, mergers appear to advance more quickly in
perception than in production.
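
The asymmetry can be checked directly against the summary rows of Table 1. The following is
a minimal sketch rather than the author’s own computation: the names DISAGREEMENTS and
perception_share are hypothetical, and the counts are copied from the ‘Perception ahead’
and ‘Production ahead’ rows of the table.

```python
# (perception ahead, production ahead) counts from the summary rows of Table 1.
DISAGREEMENTS = {
    "cot/caught": (126, 47),
    "Don/dawn": (77, 46),
    "pin/pen": (82, 43),
    "full/fool": (72, 30),
}


def perception_share(perception_ahead: int, production_ahead: int) -> float:
    """Proportion of disagreements in which perception leads production."""
    return perception_ahead / (perception_ahead + production_ahead)


shares = {pair: perception_share(*counts) for pair, counts in DISAGREEMENTS.items()}

# Pooled over all four minimal pairs.
total_perception = sum(p for p, _ in DISAGREEMENTS.values())
total_production = sum(q for _, q in DISAGREEMENTS.values())
overall_ratio = total_perception / total_production

for pair, share in shares.items():
    print(f"{pair}: perception leads in {share:.0%} of disagreements")
print(f"pooled ratio, perception ahead to production ahead: {overall_ratio:.2f}")
```

On these figures the share is lowest for Don/dawn and highest for cot/caught, and the
pooled ratio is a little over two to one.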

These data support the observations of Labov (1994:310-70) that, as a general 
principle, socially and geographically expanding mergers affect the perceptual 
status of a contrast before they affect its productive status. Herold (1990) called this
phenomenon a merger by expansion. Contrary to the intuitive supposition that mergers
represent a loss of information because words that were once distinguished can
no longer be told apart, Herold’s study of the merger of the vowels in cot and caught 
in Eastern Pennsylvania (also discussed in Labov 1994:321-24) caused her to suggest 
that, from the point of view of speakers who have historically maintained a distinction
but who have recently come in contact with an expanding merger, mergers in fact
represent a gain of information. This is because, under conditions of dialect contact, 
when speakers who themselves maintain a distinction hear the merged production 
of other speakers, they are liable to misunderstand the speakers with a merger as they 
try to match the phonetic production associated with a single phoneme to their own 
two-phoneme perceptual model. Speakers with a merger, by contrast, rely on contextual
cues rather than sound differences to tell the two word classes apart. Mergers are
irreversible (Labov 1994:311): once a speaker has lost the lexical distinction between 
two word classes, it cannot be reacquired except by a process of rote memorization 
that resembles second language acquisition (like English-speakers learning gender 
in French or German). Therefore, the only way to resolve the problem of misunderstandings
is for the members of the community who still maintain a distinction to
abandon that distinction at the perceptual level: to stop listening for a difference in 
sound and rely on contextual cues instead, like speakers with a merger. Once they 
have done so, misunderstandings diminish, which represents a gain of information. 
This merger in perception, of course, represents the first stage of an eventual complete 
merger in the affected community. The speakers with a perceptual merger will retain 
the distinction they acquired as children in their own production, leading to the sort 
of asymmetry reflected in some of the results of Table 1, but their children will likely 
exhibit a complete merger in both production and perception. 

The observation that perception-only mergers are more common than production-only
mergers in the data of Table 1, and the association of perception-only
mergers with the geographic diffusion of mergers in progress, suggests a further 
approach to these data. We might ask whether subject-analyst disagreements are 
equally common in all regions of North America, or whether they are concentrated in 
regions where mergers are known to be in progress. For an answer, we will focus our 
analysis on the pair from Table 1 that shows the highest rate of disagreement: cot vs. 
caught. A regional breakdown of response types for this pair is given in Table 2. 

Table 2 lists five dialect regions chosen for the diversity of their status with respect
to the merger of cot and caught. The Inland North is the belt of industrial cities
around the American side of the Great Lakes, such as Chicago, Detroit, Cleveland
and Buffalo. The Mid-Atlantic region extends from New York City down to Philadelphia
and Baltimore. The West refers to the Mountain and Pacific Coast regions of the
United States, including cities like Denver, Phoenix, Los Angeles and Seattle. Canada
is self-explanatory. The Midland, for purposes of this analysis, is the strip of territory
that extends between the Inland North and the South, from Columbus, Cincinnati
and Indianapolis to St. Louis and Kansas City. Telsur’s research has established that
the Inland North and Mid-Atlantic regions exhibit a stable and consistent distinction
between cot and caught, and the West and Canada an equally solid merger. The status
of the distinction in the Midland, however, is much less clear. The major Midland
cities are historically distinct, but several of them appear to be in the midst of a
merger in progress, with older speakers maintaining a distinction and younger speakers
losing or having lost it. This was found to be the case in Cincinnati, for instance,
by Boberg and Strassel (1995).

Perception          Production
(subject judgment)  (analyst judgment)  Inland North  Mid-Atlantic  West          Canada        Midland
same                same                  0    0%       1    3%      64   74%      25   69%      28   22%
close               same                  0    0%       0    0%       3    3%       3    8%       1    1%
different           same                  0    0%       0    0%       0    0%       2    6%       5    4%
same                close                 0    0%       0    0%       9   10%       5   14%      16   13%
close               close                 0    0%       0    0%       4    5%       0    0%      11    9%
different           close                 0    0%       0    0%       2    2%       1    3%       8    6%
same                different             0    0%       1    3%       0    0%       0    0%       8    6%
close               different             6    9%       1    3%       1    1%       0    0%      11    9%
different           different            64   91%      36   92%       4    5%       0    0%      40   31%
TOTAL                                    70  100%      39  100%      87  100%      36  100%     128  100%
Perception ahead                          6    9%       2    5%      10   11%       5   14%      35   27%
Production ahead                          0    0%       0    0%       5    6%       6   17%      14   11%
Total disagreements                       6    9%       2    5%      15   17%      11   31%      49   38%

Table 2. Perception vs. production of the contrast between cot and caught, by dialect
region, from sociolinguistic interviews with native speakers of North American English,
tape recorded for the Atlas of North American English (Labov, Ash & Boberg, in press).

The regional analysis of Table 2 shows that there is a fairly clear relationship 
between the status of the merger and the frequency of subject-analyst disagreements. 
In three out of the four areas where phonemic contrast enjoys a stable, consistent 
status as either present or absent, disagreements are relatively infrequent, below the 
continental average of 24 per cent. This is particularly true where a solid distinction is 
maintained: minimal pair tests in the Inland North and Mid-Atlantic regions produce 
only nine and five per cent disagreements, respectively. Ask a speaker in Detroit or 
Philadelphia whether cot and caught sound the same, and you will very likely get a 
clear distinction in both production and perception. In fact, speakers in these areas 
are sometimes puzzled by the purpose of the question, and cannot imagine how the 
two words could possibly sound the same. 

Regions that are known to have stable and consistent mergers show much less
certainty in judgment than regions with a solid distinction. It is not clear at this
point why this should be so, unless it relates to the feeling of insecurity, mentioned
above, that would lead speakers to claim that there is a difference in sound between
two words because they know they are spelled differently, in order to avoid seeming
uneducated. Disagreements are relatively uncommon in the West (17 per cent), below the
continental average, but are surprisingly frequent in Canada (31 per cent), which may
suggest that the Canadian merger is more recent than has commonly been supposed.

The region with the highest frequency of disagreements, however, is clearly the 
Midland, at 38 per cent: more than a third of Midland subjects report intuitions about 
phonemic inventory that do not agree with the observations of the analyst. Given 
the large number of Midland subjects in the sample, this suggests that much of the 
subject-analyst discrepancy evident in the overall data was due to this single region. 
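
The regional comparison can be restated as a small calculation. This is an illustrative
sketch only: the names REGIONS, CONTINENTAL and rate are hypothetical, the regional counts
are the ‘Total disagreements’ and ‘TOTAL’ rows of Table 2, and the continental figure is the
cot/caught column of Table 1. The five regions of Table 2 cover only part of the
continental sample, so the Midland share computed here is a rough indication of that
region’s weight, not a full decomposition of the overall rate.

```python
# (total disagreements, total subjects) per region, from Table 2.
REGIONS = {
    "Inland North": (6, 70),
    "Mid-Atlantic": (2, 39),
    "West": (15, 87),
    "Canada": (11, 36),
    "Midland": (49, 128),
}

# cot/caught disagreements and subjects for the whole sample, from Table 1.
CONTINENTAL = (173, 729)


def rate(disagreements: int, subjects: int) -> float:
    """Proportion of subjects whose judgment differs from the analyst's."""
    return disagreements / subjects


continental_rate = rate(*CONTINENTAL)

# Regions whose disagreement rate exceeds the continental average.
above_average = sorted(
    region for region, counts in REGIONS.items() if rate(*counts) > continental_rate
)

# Share of all cot/caught disagreements contributed by Midland subjects.
midland_share = REGIONS["Midland"][0] / CONTINENTAL[0]

print(f"continental rate: {continental_rate:.0%}")
print("above average:", above_average)
print(f"Midland share of all disagreements: {midland_share:.0%}")
```

On these counts only Canada and the Midland exceed the continental rate, and Midland
subjects account for somewhat more than a quarter of all cot/caught disagreements.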

7. CONCLUSIONS. Two conclusions can be drawn from the preceding observations. The
first is that, in situations where the phonemic inventory is stable, native speakers’
intuitions about it are in fact fairly reliable. The second is that, where the phonemic
inventory is undergoing change, speakers’ intuitions about it are often at odds with the
facts observed by a linguist. The causes of disagreement in the latter circumstances go well
beyond trivial factors like attention and sociolinguistic factors like linguistic
insecurity: they clearly involve a genuine confusion on the part of many subjects about the
status of an element of linguistic structure that is subject to variation and change in
their community. The very high frequency of subject-analyst disagreements in the
Midland is clearly problematic for a linguistic methodology that depends on native
speaker intuitions as its primary source of evidence. These data show that while it
may sometimes be necessary or even desirable to turn to native speakers’ intuitions as
evidence in linguistics, such evidence should always be interpreted with caution and
even skepticism, and should be checked against empirical data whenever possible.

We have focused here on the question of phonemic contrast, but it is not hard to
imagine how the conclusions of this study would extend to other questions and levels
of linguistic structure. Where elements of morphology or syntax are diachronically
unstable, speakers’ intuitions about the grammaticality of those elements are likely
to be equally unreliable. The unreliability, as indicated above, may arise from linguistic
insecurity and awareness of the negative social connotations of certain ways of
speaking, or from simple confusion and uncertainty caused by a mixture of competing
grammars in the speaker’s environment.

We shall leave the last word on this subject to one of Telsur’s informants, who
illustrates brilliantly the confused judgments that are often produced by dialect
mixture. What he says in the transcript that follows may seem amusing to
English-speaking readers, because as English speakers ourselves we have access to a set of
facts that clearly contradict his misinterpretation of variation in the lexical incidence
of a phoneme as the basis of a semantic distinction. However, one can only imagine
how much more difficult it would be to avoid being seriously misled by this sort of
evidence if one were studying a language one knew relatively little about, in a foreign
culture. The speaker is a 35-year-old truck driver (Telsur subject no. TS 116) from
south-central Michigan, a traditionally Inland Northern region that experienced, in
the decades after World War II, an influx of Midland and Upper Southern migrants
looking for work in Northern factories, and therefore exhibits dialect mixture of the
sort we are interested in. The interviewer (the author) asks the subject to give a
synonym for pigs, in an attempt to gather data on his pronunciation of the word hogs. This
pronunciation varies regionally, along with that of many other words containing <o>
before /g/, between the /a/ of cot and the /ɔ:/ of caught. In the transcript that follows,
we will write hog for the former, and hawg for the latter pronunciation.

Interviewer: What’s another word for pigs?

Subject: Hogs. 

I: Yeah. 

S: Or hawgs.

I: Right. Now which, how do you say that word? 

S: Depends on what kind; which one you’re talkin’ about. Uh, there’s some kinds 
that they, they pronounce 'em as hawgs, some kinds they, they pronounce as 
hogs. It’s like a different breed. Neighbor down the street, he’s a hog farmer, 
and he calls 'em hogs. Okay? Now, he also says there’s a breed that they call 
hawgs. And hogs. 

I: Are there different places where these breeds come from? 

S: Yup. 

I: Which one’s which? 

S: Uh, I don’t know. 

Over the past several decades¹, syntactic research has been dominated by generative
linguistics, the main research technique of which is the well-formedness judgement
of sentences². The resulting methodological difficulties and points of criticism
have been amply documented in, e.g., Cowart (1997) and Schütze (1996). However,
given the large amount of harsh criticism that is frequently directed at naively gathered
acceptability judgements (usually accompanied by commitments to corpus analyses;
cf. many references cited in Schütze 1996 and contemporary cognitive-linguistic
and/or functionalist studies), it is surprising to see that there is only a relatively small
number of studies that explicitly compare the ways in which different methodologies
yield different (kinds of) data. Since my point is mainly methodological in nature,
I decided to investigate a phenomenon that has already been thoroughly studied,
namely what I will, for ease of exposition, refer to as the English genitive alternation.


(1) a. the speech of the President
    b. NP_Possessed of NP_Possessor (= of-genitive)

(2) a. the President’s speech
    b. NP_Possessor ’s NP_Possessed³ (= s-genitive)


Many variables influencing native speakers’ choices of constructions have been identified
(cf., e.g., Altenberg 1982; Leech, Francis, and Xu 1994; and especially Stefanowitsch
1997 for overviews); for practical purposes, I will concentrate on only three, namely:

• the syllabic lengths of NP_Possessor and NP_Possessed (cf., e.g., Poutsma 1914) such
that short NPs tend to precede long NPs (to be represented as short » long);

• the animacy of the two NPs’ referents (cf., e.g., Poutsma 1914; Jespersen 1949;
Hawkins 1981);

• the (discourse-)givenness of the referents of the two NPs such that NPs
encoding given referents tend to precede NPs encoding new referents (to be
represented as given » new; cf., e.g., Altenberg 1980, 1982; Standwell 1982).

The different kinds of data to be discussed are:

(i) intuitions from informed linguists representing the generative approach, where
it often seems that the only informant is the investigating linguist himself;

(ii) corpus data (both spoken and written);

(iii) acceptability judgements from linguistically naive native speakers.

                 Spoken data   Written data   Row totals
of-genitives          75            75            150
s-genitives           75            75            150
Column totals        150           150            300

Table 1. Composition of the sample of corpus data.

First, I will show how the results of the different kinds of data relate to one another. 
Second, I will show how, counter to popular reasoning, syntactic research benefits 
from the investigation of both carefully elicited judgements and balanced corpus data. 
Note once more that the focus is not on finding out something (new) about the 
genitive constructions—the analysis of the genitive, whatever results it may yield, is 
merely a means of making a methodological point. 

1. METHODS. 

1.1. informed linguists. In order to obtain informally gathered intuition data from
informed linguists, I presented several linguists with the variables’ proposed effects
and some example sentences and asked them, on the basis of their intuitions as
linguists and native speakers, to formulate generalizations

• concerning the power of the variables in determining the choice of
construction;

• concerning the (frequency) distribution of the particular features under
investigation and the existence of genitive types that are defined by significant
co-occurrences of particular variables’ values.

1.2. corpus data. Using MonoConc Pro 2.0, the pseudo-random sample of genitive
constructions given in Table 1 was drawn from the British National Corpus (BNC, first
edition). Each instance of a genitive was coded with respect to the above variables,
that is, the syllabic lengths of the two NPs, the (degrees of) animacy of the referents
of the two NPs, and the discourse-givenness of the referents of the two NPs⁴. On
the resulting data, I carried out a multifactorial ANCOVA in order to (i) estimate each
variable’s impact on the choice of the genitive and (ii) investigate the expected
two-way interactions of variables and the genitive construction (cf. section 2.2.1)⁵. Also, I
determined the most significant clusters of variables describing typical genitives (cf.
section 2.2.2).

1.3. acceptability judgements. Given the six variables (three for NP_Possessor and three
for NP_Possessed) to be analysed, I developed a factorial token set (using Cowart’s 1997:
48f. terminology), as shown in Table 2. Thus, for a fully factorial set, 2 × 4 × 4 × 3 × 3 = 288
individual tokens had to be developed. To that end, I combined each of the genitives
with each degree of animacy of both NP_Possessor and NP_Possessed and systematically varied
the lengths of the two NPs as well as their referents’ degrees of givenness (by means of
a sentence preceding the sentence with the genitive to be judged)⁶. In order to increase
the likelihood of representative results, various controls were implemented. For example,
since the frequency of linguistic elements can distort the results, the frequency of
the nouns figuring in the genitives was controlled for by only admitting the 2.5% most
frequent words of English (according to the Cobuild electronic dictionary E-Dict). Also,
in order not to base the interpretation of the results on a single token set (results might
then be due to individual lexical items only), a different though analogously designed
token set was developed, yielding a total of 576 experimental items. Then, the list of
experimental items was interspersed with 576 filler items of other syntactic constructions
with varying degrees of acceptability. The questionnaire was standardised such
that each subject received a different set of randomly ordered stimuli and fillers, and the
required judgement process was explained and exemplified. The explanation made clear that
the scale of grades to be used by the subjects was anchored only at its endpoints (cf.
Schütze 1996: 189, n. 12; Cowart 1997: 71).

Variable                                    Levels
Genitive                                    of vs. s
Animacy of NP_Possessor (A_Possessor)       human, animate+non-human,
                                            concrete+inanimate, abstract
Animacy of NP_Possessed (A_Possessed)       human, animate+non-human,
                                            concrete+inanimate, abstract
Length of NP_Possessor (L_Possessor)        L_Possessor > L_Possessed,
relative to length of NP_Possessed          L_Possessor = L_Possessed,
(L_Possessed)                               L_Possessor < L_Possessed

Table 2. Independent variables manipulated in the questionnaire.
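
The size of the fully factorial design can be checked by enumerating the factor levels. The following is a minimal Python sketch, not the procedure actually used to build the questionnaire: the genitive, animacy and relative-length levels are those of Table 2, whereas the labels of the three givenness levels are placeholders assumed here, since the text gives only their number.

```python
from itertools import product

# Levels as listed in Table 2.
GENITIVE = ["of", "s"]
ANIMACY = ["human", "animate+non-human", "concrete+inanimate", "abstract"]
REL_LENGTH = ["possessor>possessed", "possessor=possessed", "possessor<possessed"]
# Assumption: placeholder labels; the text only says this factor has three levels.
REL_GIVENNESS = ["givenness-1", "givenness-2", "givenness-3"]

FACTORS = {
    "genitive": GENITIVE,
    "a_possessor": ANIMACY,
    "a_possessed": ANIMACY,
    "rel_length": REL_LENGTH,
    "rel_givenness": REL_GIVENNESS,
}

def factorial_token_set(factors):
    """Return one dict per combination of factor levels (one per token)."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]

tokens = factorial_token_set(FACTORS)
print(len(tokens))      # conditions in one token set
print(2 * len(tokens))  # two analogously designed token sets
```

Under these assumptions the enumeration yields the 288 conditions stated above, and two such token sets give the 576 experimental items.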

The subjects, who participated in this experiment voluntarily, were all native speakers
of English without training in linguistics and unaware of the exact purpose of
the analysis. The resulting acceptability ratings were then analysed using AN(C)OVAs
in order to determine how each variable’s two-way interaction with the construction
influences (or fails to influence) the acceptability ratings.

2. RESULTS AND DISCUSSION. 

2.1. informed linguists. As to the first question (the degree to which the variables 
analysed influence the choice of construction), the results are fairly heterogeneous. 
The following rank-orderings of variables were obtained 7 : 
(3) a.–d. [four rank-orderings of the variables A_Possessor, A_Possessed, L_Possessor,
L_Possessed, D_Possessor and D_Possessed, with ‘>’ marking greater importance and ‘=’
equal importance; the individual orderings are not legible in the source]

Two tendencies can be observed. First, animacy is fairly consistently considered to
be among the most important determinants of the constructional choice. Second,
NP_Possessor is, on average at least, considered to be important and NP_Possessed is not. Note
also, however, that there is an interaction such that if NP_Possessed is important, then it
is only in terms of its degree of animacy. On the whole, however, the results are
heterogeneous: there is no consistent ranking of variables or NP kinds and we find that
variables equated by some linguists are not equated at all by others.

As to the second question (the frequency distribution of features co-occurring
[frequently/significantly]), the results were fairly homogeneous. Consider (4) and
(5) for the feature clusters (for of-genitive and s-genitive respectively) claimed to
be prominent (blanks indicate that the respective variable was not included in the
expected significant type by the informants).


(4) a.                    animate      NP_Possessed
    b.  long     new                   NP_Possessed
    c.           new      abstract     NP_Possessed

(5) a.  short             animate      NP_Possessor
    b.  short    given    (animate)    NP_Possessor
    c.  short    given    human        NP_Possessor
    d.                    inanimate    NP_Possessed


These proposals as to frequency distributions of feature clusters also yield interesting
results. First, linguists’ estimations concerning the s-genitive and the of-genitive
focussed on NP_Possessor and NP_Possessed respectively. This is somewhat surprising since
both genitives obviously consist of NP_Possessor and NP_Possessed, and I do not know how
to explain this unanimous focus on one NP in each construction. Second, possessors
in s-genitives are in general considered to be short, given, and animate (thus supporting
the predictions of given » new and short » long). On the other hand, NP_Possessed in
of-genitives is supposed to be long and new (with disagreement concerning animacy).
This, however, ties in only with the predictions concerning NP_Possessed of the s-genitive,
since a long and new NP_Possessed in of-genitives violates both short » long and given »
new. In other words, giving even such a simple constellation of variables and expected
effects to experienced linguists seems to pose computational problems such that the
subjects ultimately failed to account for the predicted two-way interaction and
produced unexpected and contradictory predictions. Finally, the results of both the
variable ranking and the expected feature clusters do coincide to some extent in that both
strategies lead us to expect that NP_Possessor is more important than NP_Possessed.
                              A_Possessed
                   abstract            concrete            human        Row totals
A_Possessor        of       s          of       s          of    s      of    s     total
abstract           80 (?)   37 (?)     9        8          3     2      92    47    139
concrete           22 (+)   0 (?)      20 (+)   1 (?)      0     0      42    1     43
animate + human    9 (-)    58 (+++)   1 (-)    35 (+++)   6     9      16    102   118
Column totals      111      95         30       44         9     11     150   150   300
  (of + s)         206                 74                  20

(?) = significance marker illegible in the source.

Table 3. Genitives relative to animacy in the corpus data (as a (3 × 3) × 2 table)⁹.


2.2. CORPUS DATA.

2.2.1. variable strengths. As a first step, before we look at the individual variables’
effects, let us look at whether the variables singled out for attention correlate with
the choice of genitive constructions in the data in any way worth mentioning. Without
belabouring statistical technicalities, the overall correlation is fairly high and
highly significant, showing that the variables included in the analysis indeed contribute
strongly to the alternation⁸.

Let us first look at the impact of animacy on the choice of genitive constructions.
Consider Table 3, which provides the frequencies of each genitive construction
depending on A_Possessor and A_Possessed. The distribution of constructions is, as can be
easily seen, different from chance (R_mult = .64; F(7, ?) = 29.3; p < .001). The cells
responsible for this effect contain plusses/minuses (depending on whether the observed
frequency is higher/lower than the expected one); the number of plusses/minuses
indicates the significance level of the cells’ deviations from the expected frequencies as
determined by a configural frequency analysis (cf. Krauth 1993).
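
How the plusses and minuses in Table 3 come about can be illustrated with a simplified calculation. The Python sketch below assumes a first-order base model in which the three factors are independent, so that a cell's expected frequency is the product of its three marginal totals divided by N²; this base model is an assumption made here for illustration. The sketch reproduces only the direction of each deviation, not the significance tests of the configural frequency analysis, and all function and variable names are illustrative.

```python
from collections import defaultdict

# Observed frequencies from Table 3:
# OBSERVED[possessor animacy][possessed animacy] = (of-genitives, s-genitives)
OBSERVED = {
    "abstract":      {"abstract": (80, 37), "concrete": (9, 8),  "human": (3, 2)},
    "concrete":      {"abstract": (22, 0),  "concrete": (20, 1), "human": (0, 0)},
    "animate+human": {"abstract": (9, 58),  "concrete": (1, 35), "human": (6, 9)},
}
GENITIVES = ("of", "s")

def cell_counts(observed):
    """Flatten the nested table into {(possessor, possessed, genitive): n}."""
    return {
        (possessor, possessed, genitive): n
        for possessor, row in observed.items()
        for possessed, pair in row.items()
        for genitive, n in zip(GENITIVES, pair)
    }

def expected_counts(cells):
    """Expected frequencies if the three factors were fully independent."""
    total = sum(cells.values())
    margins = [defaultdict(int) for _ in range(3)]
    for key, n in cells.items():
        for axis, level in enumerate(key):
            margins[axis][level] += n
    return {
        key: margins[0][key[0]] * margins[1][key[1]] * margins[2][key[2]] / total**2
        for key in cells
    }

cells = cell_counts(OBSERVED)
expected = expected_counts(cells)
for key in sorted(cells):
    sign = "+" if cells[key] > expected[key] else "-"
    print(key, cells[key], round(expected[key], 1), sign)
```

Under this assumption, the direction of the deviations agrees with the legible markers in Table 3, for instance an excess of s-genitives with animate or human possessors.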

On the level of row and column totals, two results are immediately obvious: first,
animate/human instances of NP_Possessed are rare and the more human an NP’s referent is,
the less likely it is to occur as NP_Possessed. That humans are rarely NP_Possessed is, on
the one hand, not surprising, given how we conceptualise possession (cf. Taylor 1995: 202ff.,
1996). On the other hand, it is interesting to note in passing that 206 out of 300 genitive
constructions (nearly equally of- and s-genitives) have an abstract entity as NP_Possessed
rather than a concrete object (as would be expected from such prototype-based approaches to
possession and genitives in English). No similarly clear bias, however, can be observed for
NP_Possessor: animate and human possessors occur often (though abstract possessors are
most frequent) and concrete possessors occur only rarely.

Let us finally turn to significant individual (pairs of) cells and, thus, two distinct
usage patterns of genitive constructions. On the basis of the data, two significant
patterns of genitive usage can be identified.

(6) NP_abstract of NP_abstract / NP_concrete

(7) NP_animate/human ’s NP_abstract/concrete
On the one hand, the of-genitive is significantly preferred when both NPs are abstract,
when both NPs are concrete, and when NP_Possessor is concrete and NP_Possessed is
abstract. On the other hand, the s-genitive is preferred when NP_Possessor is animate/
human. These patterns are so strong that, once we know what NP_Possessor looks like,
we can predict 92 + 42 + 102 = 236 out of 300 (78.7%) genitive constructions correctly.
NP_Possessed, however, does not play a prominent role when it comes to deciding on a
construction.
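
The 78.7% figure can be reproduced from the marginal totals of Table 3 with a simple majority rule: for each animacy class, predict whichever genitive is more frequent in that class. The Python sketch below is only an illustration of this arithmetic (the names are assumed, not taken from the study); applying the same rule to the animacy of NP_Possessed shows how little that variable contributes.

```python
# Marginal frequencies from Table 3, by animacy of each NP.
BY_POSSESSOR = {
    "abstract":      {"of": 92, "s": 47},
    "concrete":      {"of": 42, "s": 1},
    "animate+human": {"of": 16, "s": 102},
}
BY_POSSESSED = {
    "abstract": {"of": 111, "s": 95},
    "concrete": {"of": 30,  "s": 44},
    "human":    {"of": 9,   "s": 11},
}

def majority_rule(table):
    """Count correct predictions when each class predicts its majority genitive."""
    correct = sum(max(counts.values()) for counts in table.values())
    total = sum(sum(counts.values()) for counts in table.values())
    return correct, total

for name, table in (("possessor", BY_POSSESSOR), ("possessed", BY_POSSESSED)):
    correct, total = majority_rule(table)
    print(f"{name}: {correct}/{total} = {correct / total:.1%}")
```

With the animacy of NP_Possessor the rule classifies 236 of the 300 constructions correctly; with the animacy of NP_Possessed it classifies 166, only slightly above the 150 expected from the balanced sample.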

Space limitations do not permit detailed inspection of the corpus data with respect
to the types just mentioned (cf., however, section 2.2.2 below), but a brief comment
will serve to indicate the ways an analysis could be continued rewardingly. One such
possibility is the analysis of semantic relations between the two NPs involved: the
pattern in (6) admits a variety of semantic relations between NP_Possessor and NP_Possessed
such as attribute/holder of attribute, part/whole, etc. (cf. Stefanowitsch 1997 for an
illuminating inventory of relations and their distribution), whereas the semantic
relations of the pattern in (7) are most often those of possessor/possession, agent/action
and attribute/holder of attribute. That is, even a cursory glance at real data shows
the implausibility of assuming that the two constructions are synonymous or used
interchangeably; this implies that, at least on the basis of our data, there is no need to
derive one construction from the other in any way whatsoever.

The next variable to be investigated is concerned with the syllabic lengths of the
two NPs involved in the genitive. According to previous studies, we would expect to
find a two-way interaction between the NP (NP_Possessor vs. NP_Possessed) and the genitive
construction (of-genitive vs. s-genitive) such that short » long. A 2-way ANOVA,
however, shows that the overall correlation between the kinds of NP and genitives is
significant (F(3, 596) = 2.97; p = .031), but not in ways we would expect:

• there is a significant main effect such that the two genitives differ with
respect to the average lengths of the NPs: of-genitives are formed out of
longer NPs than s-genitives (F(1, 596) = 7.14; p = .008);

• the predicted two-way interaction is not significant (F(1, 596) = .7; p = .405) and
the observed tendency is even in the opposite direction of what
syntactic-weight approaches would predict; cf. the left part of Figure 1.

That is to say, approaches to the genitive placing a strong emphasis on heaviness of
constituents are not supported by the data, a result I found somewhat astonishing. But
before we jump to conclusions too hastily, recall that many analyses of corpus data
are based on written data only; the present corpus, however, is balanced with respect
to the medium, so we can easily filter out this effect. Consider Table 4, where (within
each medium and across all examples) for each construction the average lengths of
NP_Possessor and NP_Possessed are compared.

A 3-way ANOVA including the medium (spoken vs. written) yielded two results
worth further discussion. First, the analysis revealed that the NPs in the written part
of the corpus are on average significantly longer (F(1, ?) = 11.96, p < .001). Second, and
more interestingly, there is a significant 3-way interaction (F(1, ?) = 4.48, p < .034) such
that:

• for the written data, the two-way interaction is even more in the unexpected
direction;

• for the oral data, the two-way interaction is nearly as expected: with of-genitives,
NP_Possessor is longer than NP_Possessed; with s-genitives, there is practically
no difference.

                       oral                      written                     total
                of-genitive  s-genitive   of-genitive  s-genitive   of-genitive  s-genitive
NP_Possessor        3.7         3.1           4           3.9           3.8         3.5
NP_Possessed        3.1         3             4.5         3.2           3.8         3.1

Table 4. Average NP syllabic lengths of the genitives in the corpus data.

[Figure 1: interaction plots with kind of NP (Possessor, Possessed) on the x-axis and separate lines for the of-genitive and the s-genitive.]

Figure 1. Interaction plots: (Genitive x NP) for lengths (left) and (Genitive x NP) for
DTLM (right).

That is to say, we must be careful not to leave aside medium-specific differences: the
results for written and oral data diverge so strongly that the unexpected overall results
may hide the expected results of the oral data, if the medium is not accounted for
carefully. This is an important lesson to learn for corpus-based analyses of syntactic
phenomena, especially when one tries to account for syntactic phenomena in terms of
processing restrictions or similar variables where medium differences can be decisive.
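
The medium-specific pattern can be read off Table 4 by comparing, for each construction, the NP that comes first with the NP that comes second. The following Python sketch uses the mean lengths from Table 4; the function name and data layout are illustrative assumptions. A negative difference means that the first NP is on average shorter than the second, as short » long predicts.

```python
# Mean syllabic lengths from Table 4:
# MEANS[medium][genitive] = (mean length of NP_Possessor, mean length of NP_Possessed)
MEANS = {
    "oral":    {"of": (3.7, 3.1), "s": (3.1, 3.0)},
    "written": {"of": (4.0, 4.5), "s": (3.9, 3.2)},
}

def first_minus_second(genitive, possessor, possessed):
    """Mean length of the first NP minus that of the second NP."""
    if genitive == "of":           # order: NP_Possessed of NP_Possessor
        return possessed - possessor
    return possessor - possessed   # order: NP_Possessor 's NP_Possessed

for medium, by_genitive in MEANS.items():
    for genitive, (possessor, possessed) in by_genitive.items():
        diff = round(first_minus_second(genitive, possessor, possessed), 1)
        verdict = "short before long" if diff < 0 else "not short before long"
        print(f"{medium:8s} {genitive:3s} {diff:+.1f}  {verdict}")
```

Only the oral of-genitives show a clearly negative difference; in the written data both constructions place the longer NP first on average, which is why pooling the two media obscures the expected effect.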

Finally, let us deal with the discourse givenness of the two NPs and their effect on
the choice of genitive. Consider Table 5, where the average values of the
distance to last mention (DTLM) in clauses are given.

                       oral                      written                     total
                of-genitive  s-genitive   of-genitive  s-genitive   of-genitive  s-genitive
NP_Possessor        5.9         7.3           8           6.3           6.9         6.8
NP_Possessed        7.2         9.?           8.7         7.6           8           8.4

(9.?: second digit illegible in the source.)

Table 5. Average DTLM scores of the NPs in the genitives of the corpus data.

Again, previous studies lead us to expect a two-way interaction between NP
(NP_Possessor vs. NP_Possessed) and the genitive construction (of-genitive vs. s-genitive) such
that given » new. Again, however, the 3-way ANOVA (this time including the medium
right from the start), though highly significant (F = 7.7; p < .001), shows that:

• the two kinds of NP differ with respect to their average givenness
such that the average NP_Possessor is more given than the average NP_Possessed
(F(1, 596) = 20.57; p < .001);

• the predicted two-way interaction is not significant (F(1, 596) = .95; p = .329 ns),
but the observed tendency is indeed in the predicted direction; cf. the right
part of Figure 1.

While the second result is easy to account for (since it is, though non-significant, at
least in the correct direction), I find it difficult to account for the first one. An
explanation might be that we simply speak about possessors more often since, as we have seen
above, they tend to be human. If we speak about them more often, then of course the
different occasions on which we refer to them are closer to one
another, resulting in the observed main effect of DTLM. It remains to be seen to what
extent the analysis of the acceptability judgements can shed light on this issue.

2.2.2. TYPES OF GENITIVES AS DETERMINED BY SIGNIFICANT FREQUENCY. While the
previous section has investigated each variable on its own, let us now look at the genitive
types defined by significant feature clusters of all variables simultaneously. While
the overall number of significant types (as determined by a hierarchical configural
frequency analysis) is too large to be discussed in detail, the most important types for
of-genitives and s-genitives are given in (8) and (9) respectively (the interval variables
[length and DTLM] were dichotomised on the basis of their arithmetic mean within
each register).


(8) a. NP_{concrete, new, short} of NP_{concrete, new}
    b. NP_{abstract, new} of NP_{human/animate, short}

(9) NP_{human/animate, short} ’s NP_{concrete, short}


On the whole, the types already obtained by the analysis of A_Possessor and A_Possessed
alone are supported; given the above corpus results on length and givenness, it is not
surprising to see that the identifiable types do not unanimously support the expected
tendencies (short » long and given » new).
[Figure 2: two interaction plots; x-axes: Animacy Possessor (left) and Animacy Possessed (right).]

Figure 2. Interaction plots: Genitive x A_Possessor (left) and Genitive x A_Possessed (right).


2.3. acceptability judgements. As we have seen, the corpus analysis yielded fairly
heterogeneous results such that some previous accounts were supported whereas
some tendencies one might have taken for granted were not. Let us now turn to the
results of the survey and find out to what extent the results fit together. First, again,
the overall correlation between all variables and the choice of construction is highly
significant¹⁰. We will now proceed in the same order as for the corpus data and start
with A_Possessor.

While no significant main effect can be found, for the two-way interaction between
the genitive and A_Possessor we obtain a clear and significant pattern (F(3, 499) = 7.15; p < .001):
animate and human possessors are preferred in the s-genitive whereas abstract as well
as concrete and inanimate possessors are preferred in the of-genitive. These findings
are virtually identical to and, thus, strongly support the results obtained in and
interpretations derived from the corpus analysis (cf. the left part of Figure 2)¹¹.

A somewhat different picture emerges from the analogous analysis of A_Possessed.
First, there is a significant main effect showing that the more human NP_Possessed is, the
less acceptable are both constructions (F(3, 498) = 6.18; p < .001), an effect we are already
familiar with from the corpus data. More important, however, is the (significant)
two-way interaction (F(3, 498) = 9.09; p < .001) between the genitive constructions and
A_Possessed. In the corpus analysis, NP_Possessed does not differentiate between the two
constructions. The results of the acceptability judgements support these results for
abstract and concrete NP_Possessed, which again obtain virtually identical ratings in
both constructions. However, animate NP_Possessed are preferred in s-genitives, whereas
human ones are preferred in of-genitives. This is interesting in two respects: first, it
shows that there is a strikingly high general coincidence of corpus and judgement
data. Second, it shows that, where the corpus data have not provided relevant
information (recall that no cases of animate NP_Possessed were found), the judgement data help us
to describe the constructional preferences in such cases (cf. the right part of Figure
2)¹². (Space does not permit the discussion of the marginally significant interaction
Genitive x A_Possessor x A_Possessed.)
Let us now turn to the lengths of NPs and degrees of givenness of the NPs’ referents.
In sum, the results of the questionnaire study again support those of the corpus
analysis: while there is a tendency in the direction of short » long, the two-way
interaction between the length of the two NPs and the construction clearly fails to
reach standard levels of significance (F(2, 500) = 1.53; p = .72). Thus length does not seem
to play a role in the choice of construction. With distance to last mention the situation
is slightly different: the interaction between the distances to last mention and
the construction is significant (F(2, 500) = 3.28; p = .039), such that there is a tendency for
given possessors to be preferred in the s-genitive. This is, however, only a tendency,
as post hoc tests (Scheffé) reveal no significant differences between the six arithmetic
means (as opposed to those of A_Possessor and A_Possessed). However, we still need to test
whether the main effect noted above (the average NP_Possessor is more given than the
average NP_Possessed) has been verified experimentally. In accordance with the corpus
data, there is in fact a non-significant tendency in this direction (F(2, ?) = 1.81; p = .17):
s-genitives are preferred when NP_Possessor is more given than NP_Possessed. Thus, while
the two kinds of results are as yet inconclusive, the a posteriori hypothesis I proposed
above could at least be explanatorily adequate.

3. CONCLUSION. 

3.1. interim summary. The intuitions of informed linguists did not convey a unified
picture: while we find agreement between the importance of variables and NP types, we
obtain contradictory results for the frequent/typical clusters to be expected. The results
of the corpus analysis are highly heterogeneous in how the results relate to previous
approaches or more general predictions, something often found once natural data are
analysed. Still, though, the corpus data have proven useful in several respects: variables
could be weighted according to their importance for the alternation, it was possible to
identify constructional types, and we saw how the neglect of medium differences can
influence (not to say, distort) the results. On the whole, the corpus data correspond to
the experimental acceptability judgement data. For most of the variables, virtually
complete overlap between the kinds of results was found and, in the case of A_Possessed, the
judgement data even add precision to the corpus findings.

Let us now turn to the more central question, that of how these results relate to the
linguists’ intuitions. On the positive side, we find that the informants’ expectation as to
the relevance of the variables was, though far from unanimous, accurate, at least to some
degree: A_Possessor is indeed the strongest variable determining the choice of construction.
Also, the intuitions that (i) NP_Possessor of s-genitives would frequently be animate/human
and short as well as (ii) NP_Possessed of of-genitives would frequently be new (counter to
discourse-functional predictions!) are borne out by the corpus data. On the (I believe
somewhat stronger) negative side, however, we find that, on the whole, the linguists
failed to predict:

• the complete overall irrelevance of length and givenness to the choice of
construction that was found in both the corpus data and the acceptability
judgements (recall the comparisons of means in the ANOVAs) as well as the
relative irrelevance of A_Possessed to the choice of construction;

• the relevance of the difference between animate and human NP_Possessed found
in the judgement data;

• the fact emerging from the corpus data that NP_Possessed tends to be abstract
in both constructions¹³.

These results, I submit, strongly support the claim that informed linguists’ intuitions 
on (syntactic) phenomena are inadequate. Obviously, such intuitions can serve as a 
good, easy-to-obtain and (at times) even accurate starting point of the analysis, but 
the analyst must be willing to (i) discard every single working hypothesis in the light 
of evidence to the contrary and (ii) integrate the more fine-grained information of 
corpus data and methodologically sensible questionnaire studies into his account. 
The following section addresses this issue in slightly more detail. 

3.2. conclusion (and a guideline). Given the course of the analysis, I believe the
following conclusions are warranted. On the one hand, individual intuitive data may,
but need not, provide valuable insights into a phenomenon. Given the overwhelming
empirical evidence pointing to potential threats to the objectivity, validity and
reliability of intuition data thus obtained, however, I believe that empirically more
sensible strategies are required. On the other hand, simply abandoning acceptability
judgements in general seems premature, to say the least, since, once gathered in
scientifically appropriate ways, they strongly coincide with or even improve on the often
desired alternative of corpus data. (For a completely different study where equally
refined judgement data are compared to corpus findings with similar results, cf. Gries
ms.) Note especially that this coincidence of results has been found both for cases where
variables have turned out to be important and for cases where variables turned out to be
unimportant.

In sum, on the basis of the above results and the conclusions that can be drawn 
from the empirical process as such, I suggest the following strategy (of methodologically 
different but converging evidence) to incorporate all the above methods in a 
single methodology for a thorough analysis of syntactic phenomena. This strategy 
does not totally abandon naively collected judgement data, but rather treats them as 
a heuristic exploratory device, the implications of which are subjected to a wide array 
of methodologically more reliable strategies. 

(i) Collect ideas of what variables influence the phenomenon under investigation 
on the basis of relevant literature as well as introspective data (including 
people’s intuitions) and formulate hypotheses; 

(ii) obtain carefully-balanced corpus data (recall the effect of the medium) relevant 
to the phenomenon under investigation in order to (a) perform exploratory 
data analysis and (b) gather evidence bearing on one’s hypotheses; 

(iii) depending on the results of step (ii), conduct methodologically sound 
experiments (i.e. conforming to standards outlined in Cowart 1997 and 
Schütze 1996) on those aspects of the phenomenon for which (a) no corpus 
data could be obtained and/or (b) one’s hypotheses were not supported 14; 

(iv) repeat steps (ii) and (iii) until you obtain mutually confirming results or 
identify additional factors. 

One important question remains, however: what do we do when the different strategies 
(e.g. corpus data and judgements) do not yield converging evidence? That is, 
if there is no single a priori hypothesis in support of a particular interpretation or if 
each of the two different results can be explained with reference to two mutually 
exclusive hypotheses, then which of the results (and hypotheses) should be preferred 
and on what grounds? 

The ultimate answer to this question is probably contingent on a variety of factors 
(such as personal taste, preference for methods of data collection and evaluation, the 
willingness to admit that the contradictory results cannot be reconciled at present). I 
would advocate accepting the hypotheses whose supporting results have been obtained 
most naturally. In other words, if results from corpus data contradict results from 
acceptability judgements and both could be explained equally well but differently, I 
would always tend to accept the hypothesis supported by the corpus data: the production 
of linguistic utterances/texts that happen to end up in a corpus occurred under 
completely natural circumstances and is, thus, less likely to be subject to experimental 
bias than questionnaire data (and many other experimental designs). Moreover, I would 
in general consider corpus data to be more precise in the sense that factors such as 
register, prescriptive attitudes and medium can be filtered out, whereas we can never 
be sure to what extent they influence subjects’ reactions in experimental settings (even 
if subjects are advised not to let such factors influence their reactions). Nevertheless, I 
hope (i) to have shown how, counter to common criticism, careful experimentation by 
means of acceptability judgement data can support our analysis of linguistic phenomena 
and (ii) that these findings stimulate further research of this kind. 


1 I thank Hans Boas (University of Texas at Austin), Verena Gries (Unilever Germany), Barbara 
Lohse (University of Southern California) and Debra Ziegeler (University of Manchester) 
for their help in obtaining judgement data (by forwarding questionnaires) to be 
discussed in what follows. Also, my thanks go to Constanze Buhner of Southern Denmark 
University for helping me encode the corpus data and to all colleagues participating in my 
experiment, even though they might have guessed that the results should show the inadequacy 
of linguists’ intuitions. Finally, I am indebted to Heike Wagner (University of Hamburg) 
and the Institut for Fagsprog, Kommunikation og Informationsvidenskab at SDU 
for providing computer equipment and assistant funding respectively. Without the kind 
assistance of all of these people, the huge amount of data necessary for this study could 
not have been obtained in time. 

Finally, let me note that some of the judgement results have slightly changed since the 
time of the presentation in Montreal. This is due to the fact that additional questionnaire 
data reached me only after my return. However, in all cases but one (where results have 
undergone a slight change), the results have not changed at all in the light of these additional 
data. 

2 In general, two kinds of well-formedness judgements are distinguished, namely grammaticality 
judgements and acceptability judgements (i.e. judgements concerned with 
competence and performance respectively). My study is concerned with acceptability 
judgements only. However, I believe that the two kinds of judgements are difficult to distinguish 
on a principled basis since, e.g., different versions of generative grammar do not 
always agree on what factor is a matter of competence or performance. For instance, the 
introduction of semantic concepts such as theta roles into generative grammar enables 
generative grammarians to claim that particular semantic phenomena (i.e. phenomena 
outside of the grammar) can suddenly be explained in grammatical terms. 

3 I will use the expressions NP Possessor and NP Possessed throughout the remainder of the paper 
for expository reasons although in many cases it is not (prototypical) possession that is 
denoted. 

4 The degree of animacy of the NPs’ referents was measured using the following scale: 
human > animate and non-human > concrete and inanimate > abstract. The discourse-givenness 
of the NPs’ referents was measured using the distance to last mention (dtlm) 
of the referent in the preceding ten clauses. For the purposes of this analysis, expressions 
qualified as clauses when they contained a noun phrase or a clause as a grammatical subject 
together with a finite verb; when they were participial or gerundival clauses (e.g., 
the non-italicised part in The new rules forbid more than one to put up a sign, a rule usually 
ignored); or when a new conversational turn started. However, in order not to be 
overly restrictive and proceed with too little context, the following cases were not counted 
as clauses even if they met one or more of the above-mentioned criteria: question tags; 
discourse markers such as you know, as it were, I mean; cleft sentences and false starts. 

5 We need to analyse interactions rather than main effects because of the different orders 
of NP types (NP Possessor vs. NP Possessed ) in the constructions. For example, the preference 
short » long means that possessors should be short and long in the case of s-genitives and 
of-genitives respectively, a paradigm case of a two-way interaction. 

6 It is well-known that there are also semantic restrictions on the use of the two different 
genitives. While these semantic variables are not focused upon in the present study, one 
still needs to take them into account so as not to bias the results systematically. In order 
to avoid such a skewing in the data, wherever possible I preferred semantic relations 
between the two NPs that, according to previous corpus-based analyses (Stefanowitsch 
1997, to appear), are known to occur in both genitives; such examples include possessor/ 
possessed, component/whole, attribute/holder of attribute, location/thing at location and 
family relations. 

7 In the representations of variable strengths in (3), ‘>’ and mean ‘is more important than’ 
and ‘is equally important as’; parentheses are used to support the grouping of similarly 
influential variables visually. 

8 R_mult = .65; F(14, 286) = 15.33; p < .001; the analysis was an ancova (Type VI sums of squares, no 
constant, sigma-restricted model). 

9 Animate and human possessors were subsumed under a single value because there were 
only very few animate possessors and no animate possessed at all. 

10 R_mult = .77; F(23, 268) = 16.95; p < .001; the analysis was an ancova (Type VI sums of squares, no 
constant, sigma-restricted model). 

11 This behaviour of human and animate NP Possessors provides post hoc support for grouping 
these classes together. 

12 Note also that the acceptability judgements show that, while human and animate NP Possessor 
behave identically, human and animate NP Possessed do not. 

13 Also, the linguists formulated no register-/medium-specific predictions. Admittedly, I did 
not ask for those, but it is plausible to assume that the heterogeneity of the above results 
would not have been resolved by asking the linguists to include even more information in 
their already very heterogeneous intuitions. 

14 Needless to say, I do not advocate experiments where acceptability is the only dependent 
variable. Alternatives involve operation and selection tests (Quirk & Svartvik 1966), reading 
and reaction time studies, ambiguity tests, paraphrasing and many more. 
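
The distance-to-last-mention measure described in note 4 can be sketched roughly as follows. This is only an illustration, not the procedure actually used in the study: the function name, the representation of each clause as a set of referent labels, and the toy discourse are all assumptions introduced here, and the segmentation of the text into clauses is taken as given.

```python
from typing import Optional, Sequence, Set


def distance_to_last_mention(
    clause_referents: Sequence[Set[str]],
    clause_index: int,
    referent: str,
    window: int = 10,
) -> Optional[int]:
    """Return how many clauses back `referent` was last mentioned.

    Only the `window` clauses preceding `clause_index` are searched.
    None means no mention was found in that window, i.e. the referent
    would count as new for the purposes of the analysis.
    """
    for distance in range(1, window + 1):
        earlier = clause_index - distance
        if earlier < 0:
            break  # ran out of preceding context
        if referent in clause_referents[earlier]:
            return distance
    return None


# Hypothetical toy discourse: each set lists the referents of one clause.
clauses = [{"rules", "sign"}, {"council"}, {"rules"}, {"sign", "owner"}]
for name in ("sign", "council", "owner"):
    print(name, distance_to_last_mention(clauses, 3, name))
```

A smaller value corresponds to a more recently mentioned (more given) referent; how the "no mention" case is coded numerically is left open here.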

when an individual immigrates to another country where there is a different 
mother tongue, he/she stands a good chance of losing some first language skills. This 
loss is called language attrition and has been studied in many populations of hearing 
people who use spoken language. Since research has illustrated beyond a doubt that 
sign languages are true languages in every sense of the word, the following question 
arises: if attrition can occur in a spoken language, why would it not occur in a sign 
language too? It is this question that motivated my research. 

I investigated the first language attrition of Russian Sign Language (rsl) among 
deaf immigrants to Israel from the Former Soviet Union. This is the first systematic 
investigation into the attrition of sign language. Evidence for attrition is provided as 
observed through interference and lexical gaps. This evidence can further be broken 
down into three patterns: 1) attrition-related behaviour that has previously been documented 
in the study of oral languages, 2) attrition-related behaviour unique to sign 
language, and 3) non-linguistic behaviour. 

1. the community. This research was made possible by an unprecedented situation 
of mass immigration. The collapse of the Soviet Union has brought one million Jews 
to Israel since 1988 (Remennick 1998:445). These immigrants have changed the face 
of Israeli society, demographically, culturally, and linguistically. Every sixth person in 
Israel today has Russian as their mother tongue. Russian is the most common language 
after Hebrew and Arabic. Many Russian immigrants are reluctant to give up 
their language and culture. Their reluctance to integrate and surrounding intolerance 
have led to a subculture within Israeli society. This has been termed the ‘self-isolation’ 
or ‘sociocultural ghettoization of Russians’ (Zilber 1977, cited in Remennick 1998:25). 

Among these newcomers are an estimated 1000 deaf people. Deaf immigrants 
from the former Soviet Union are in a position quite unlike the aforementioned 
majority of (hearing) Russian immigrants. Their situation encourages acculturation 
and assimilation. The small population is dispersed, with little political representation. 
There is a lack of rsl institutions and education of the deaf in Israel disregards 
rsl; there is little, if any, ethnic awareness. The limited number of employment opportunities 
available require Israeli Sign Language (isl) or Hebrew, not rsl. In addition, 
deaf Russian immigrants bring with them a low and weak self-image, both of their 
community and as individuals. 

2. my research question. I hypothesized that with little opportunity to use and 
maintain rsl, the second sign language (isl) would encroach upon the domain of 
the first sign language (rsl), the outcome being first language attrition, similar to the 
first language attrition seen in oral languages when one language displaces another. In 
this study I looked for evidence of attrition and at the forms attrition took. Furthermore, 
I hypothesized there would be more attrition among immigrants who had been 
in Israel for a longer period of time and more amongst those who had immigrated as 
children, as opposed to adults. 

3.1. theories used to investigate attrition. These hypotheses were investigated 
through two theories: first, the Theory of Interference; second, the Model of 
Ethnolinguistic Vitality (Allard & Landry 1986). 

3.2. the theory of interference. The Theory of Interference is one of the most 
widely held explanations of forgetting a language. It assumes that there will be a 
reduction in the linguistic system, based on language conflict (Freed 1980, Lambert & 
Freed 1982). It posits interplay between the two languages, in this case rsl and isl, 
and predicts that the existing patterns of L1 are modified, mapped, and reorganized 
in favour of L2, i.e., in the direction of the now-dominant language from the non-dominant 
language. Items in L1 are eventually lost, replaced by items in L2. 

3.3. the model of ethnolinguistic vitality. The Model of Ethnolinguistic Vitality, 
as proposed by Allard and Landry (1986), predicts the degree of language loss 
based on sociological and psychological factors (Allard & Landry 1992). The model 
focuses on the group and in this case, on those variables that will contribute to the 
disappearance of deaf Russian immigrants as a separate entity, culturally and linguistically. 
Linguistically, the result is the development of the second language to the detriment 
of the first. This model is viewed on three levels: 1) the social level (composed 
of demographic, political, cultural, and economic capital), 2) the socio-psychological 
level (which includes interpersonal contacts, contact with the media, and educational 
support), and 3) the psychological level (made up of the individual’s language aptitude 
and ‘cognitive affective disposition’, or in other words, the individual’s opinions, judgements 
and assessments of their own personal situation) (ibid.:171). 

4. methodology. This study focused on the lexicon, the most widely and commonly 
studied and most easily observed element of language attrition studies. It was limited 
to frequently used lexical items, based on the widely held view that high frequency 
vocabulary appears to be vulnerable in attrition. 

Twenty deaf immigrants whose L1 is rsl were located as subjects. All were either deaf 
from birth or medically and legally deaf, having lost their hearing at an early age. 
All had received a typical deaf education by Russian standards, which in every case 
meant residential schools. The subjects, eight males and fourteen females, formed a cross section 
of the deaf immigrant population. They ranged in age from 11 to 60 upon immigration, 
and had been in Israel for between one and twelve years. The number of subjects is 
too small for statistical generalizations and this research remains within the framework 
of individual loss. 

There were three stages: a preliminary self-report questionnaire followed 
by two experimental situations. The self-report questionnaire included demographic 
information and questions about attrition and attitudes toward both rsl and 
isl. In addition to providing essential background information, it allowed subjects 
to express themselves in ways that may not have been observed in the tests. Both 
of the tests had been used previously in oral language attrition studies (Lambert & 
Freed 1982, Waas 1996). Both were applicable to sign language. In Test # 1, subjects 
were shown 22 large, colourful illustrations of everyday items on cards. They were 
instructed to sign what they saw in the picture. In Test # 2, subjects had sixty seconds 
in which they had to sign as many different animals as possible. Subjects signed to 
a native speaker of rsl and were recorded on videotape. The data was analyzed by 
three different professional interpreters of rsl, coming from three different parts of 
the former Soviet Union (to take dialect variation into account), and by an interpreter 
of isl, all working independently of one another. A similar but smaller group of hearing 
Russian immigrants completed the same questionnaire and tests. 

5.1. results. The results confirmed the hypothesis that language attrition would 
occur. The self-report questionnaire confirmed a situation of language 
conflict. While rsl was dominant in domestic situations, isl was dominant 
in social situations and employment. The majority of subjects admitted, some reluctantly, 
having experienced attrition of rsl since their arrival in Israel. 

5.2. interference and lexical gaps. isl clearly interfered with rsl. For example, in 
signing ‘polar bear’, instead of signing BEAR WHITE as it is in rsl, the subject signed 
BEAR [OF THE] NORTH, a literal translation from isl signed in rsl. Gaps in the lexicon 
also appeared, for example, with the picture of a rainbow. One subject declared that 
there is no sign for rainbow in rsl and proceeded to finger spell the word. Finger spelling, 
an orthographic representation of Russian letters spelled out on the fingers, is not 
considered true sign language. Another subject, having forgotten the sign for rainbow, 
produced an incorrect sign related to neither rsl nor isl. The remaining subjects and 
the interpreters confirmed that an rsl sign for rainbow does exist. 

5.3. attrition-related behaviour previously observed in oral language. The 
deaf subjects studied exhibited behaviour previously documented in research on oral 
attrition. There was complete borrowing; there were blends. Compelled to produce 
something and lacking access to the correct form, subjects produced some sort of 
hybrid, often a nonsensical form. This parallels the lexical innovations recorded in 
oral languages. Paraphrasing was used to compensate for language loss, as was the 
substitution of a target word by a more general or semantically similar word or concept, 
e.g., TAXI for bus, BRIDGE for rainbow and KETTLE for teacup. There was 
transfer based on semantic and phonological features. Phonological errors appeared 
as incorrectly formed signs, with the wrong hand configuration, location, movement 
or orientation. Subjects also confused one- and two-handed signs. 

5.4. linguistic behaviour unique to sign language. Linguistic behaviour unique 
to sign language consisted of finger spelling and gesture in the place of or in combination 
with sign language. While Deaf people use both finger spelling and gesture, they 
are not considered to be true signs, though it is acceptable to use them in combination 
with signs. The visual modality of language permitted the use of these where attrition 
occurred. Finger spelling was used in the place of inaccessible signs. In some instances, 
it prompted recall and the finger spelling was then followed immediately by the correct 
sign. One subject looking at a picture of an airplane extended his arms from his 
body in a manner that indicated wings. This gesture, though its intention is clear, bears 
little resemblance to the one-handed sign formed with the first and last finger extended and 
moving upwards. The gestures in this study often deviated from the spatial boundaries 
for sign language and were used when the sign needed was not accessible. 

5.5. non-linguistic behaviour. Non-linguistic behaviour that also provided evidence 
of attrition consisted of requests for help from the interpreters and obvious 
physical discomfort with the task. One subject hit herself on the forehead with an 
open palm, another twirled his forefinger and another repeatedly extended both 
arms, palm upwards, as if to indicate, ‘I do not know’ or ‘I cannot remember’. There 
were numerous unnatural pauses and overt comments made during the tests, such as, 
‘My rsl is not very good’ and ‘I don’t remember anything in rsl’. 

6. conclusions. The hearing Russians who completed the same tests orally performed 
better, as predicted by the Model of Ethnolinguistic Vitality for their group. 
They have support for spoken and written Russian in Israel, exposure to it and ample 
opportunity to use and maintain their mother tongue. Thus, they produced their L1 
relatively fluently and efficiently, showing substantially less evidence of attrition. 
Their average overall rate of error was 7%. The average overall rate of error for the deaf 
immigrants was 27.7%. 

No conclusive evidence was found regarding my original hypothesis that one’s 
length of residence in Israel may be related to the amount of attrition. It appears there 
were too many confounding variables. The connection between the age of immigration 
and the amount of attrition, however, did produce somewhat clearer 
results, indicating that those who had immigrated as children generally suffered 
more language loss than those who had arrived as adults. But this information must 
be carefully considered for additional factors, such as whether these children had 
reached puberty, assuming it is a critical stage in language development, and how 
long they attended Israeli schools for the deaf, something the adults did not have the 
opportunity to do. 

7. discussion. The results of this preliminary study show that a sign language suffers 
L1 attrition for the same reasons and in patterns similar to those of spoken language. 
Sign language research has shown that sign languages are natural languages with 
grammatical structures, rules, and patterns of acquisition similar to those of spoken 
languages. Thus, this study provides additional and novel evidence for the claim that 
sign and spoken languages function like one another when it comes to human communication 
and interaction. This documentation of attrition in sign language therefore 
offers new perspectives, not only on how attrition is related to sign language, but 
on the human capacity for language as well. 


A CORPUS STUDY ON THE (NON-)PHYSICALITY OF 
LINGUISTIC OBSERVATIONS 


Douglas W. Coleman 
University of Toledo 


this study is a continuation and extension of the author’s ongoing analysis of the nature 
of linguistic data as found in a theoretical linguistics corpus. Earlier stages in this work 
have been reported in other articles by the author (Coleman 1997, 1999, and 2000). 

1. corpora. For the current study, there were two corpora. The first: Language, vol. 
67, nos. 1-4, main text of all main section articles. The second: articles from various 
journals with texts available in electronic form; all available on-line via subscription 
through the OhioLink system; the first article spanning >10 pages from the most 
recent complete volume of every twenty-fifth journal in the on-line category ‘life sciences’ 
until I had 16 articles. One article was in an incompatible file format and had 
to be rejected. Corpus items are designated by journal name abbreviation + vol. + no. 
+ starting and ending page numbers, e.g. AB 59-1; 1.10. Corpora are designated ‘TL’ 
(theoretical linguistics, those items from Language) or ‘LS’ (‘life sciences’, those items 
from the journals in the Academic Press ‘life sciences’ category). (Information on the 
corpora is presented in Table 1, overleaf.) A list of all items in the LS corpus is found 
in Appendix I. Articles from Language appear in the References section if cited. 

The discrepancy in word count is clear, though the number of articles is comparable. 
A Wilcoxon W (Mann-Whitney U) test of median article lengths shows that 
articles in TL have significantly more words than those in LS (W=204.0, p<0.0001). 
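
For readers unfamiliar with the test, the rank-based statistic behind such a comparison can be sketched as follows. The sketch is in Python rather than the statistics package used for the study, the function names are hypothetical, and the article lengths are invented purely for illustration (they are not the corpus figures). Statistical packages also differ in which variant of the statistic they report, so a value computed this way need not correspond directly to the W values given in this paper.

```python
from typing import List, Sequence


def midranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # sorted positions are 0-based
        for position in range(start, end + 1):
            ranks[order[position]] = mean_rank
        start = end + 1
    return ranks


def mann_whitney_u(first: Sequence[float], second: Sequence[float]) -> float:
    """U for `first`: pairs where first > second, ties counting one half."""
    ranks = midranks(list(first) + list(second))
    rank_sum_first = sum(ranks[: len(first)])
    return rank_sum_first - len(first) * (len(first) + 1) / 2


# Invented article lengths in words (illustration only).
long_articles = [7300, 12000, 35500]
short_articles = [2200, 5400, 8000, 9500]
u = mann_whitney_u(long_articles, short_articles)
print(u, "out of a possible", len(long_articles) * len(short_articles))
```

A U close to the product of the two sample sizes means that nearly every article in the first group is longer than nearly every article in the second; the significance level is then read from the distribution of U, which this sketch does not compute.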

2. hypothesis. Patterns of lexical usage identifying ‘data’ as something physical 
(having objective existence) or not (existing subjectively only) will reveal significant 
differences between the TL and LS corpora. 

3. method. TL articles were (1) scanned, (2) spell-checked, (3) pre-processed to remove 
hyphenation for accurate automatic keyword recognition, and (4) concordanced for 
keywords using a SNOBOL4 program written by the investigator. Then (5) keywords were 
tabulated (counted and usages categorized) and analyzed both statistically and qualitatively. 
Capitalization was ignored in the identification of tokens of a given lexical 
type. Since LS articles were already in .pdf files (Adobe Portable Document Format), 
tabulation was more direct, based on use of Adobe Acrobat Reader’s find-text function. 
Data was analyzed in a StatGraphics spreadsheet. Below, cited tokens are given 
in context, set off in bold italics; types, in contrast, are indicated in small caps; e.g., see 
the token of the type data in Figure 3 (below). 
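
Steps (3) and (4) of the procedure can be illustrated with a minimal sketch. It is written in Python, not SNOBOL4, and is not the investigator's program: the function names, the context width and the sample text (adapted from passages quoted in this paper) are assumptions made for the sake of the example.

```python
import re
from typing import List, Tuple


def remove_line_hyphenation(text: str) -> str:
    """Rejoin words that were split by a hyphen at the end of a line.

    Caveat: a genuine hyphenated compound that happens to break at a
    line end loses its hyphen as well.
    """
    return re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", text)


def concordance(
    text: str, keywords: List[str], width: int = 30
) -> List[Tuple[str, str, str]]:
    """Return (left context, token, right context) for every keyword token.

    Matching ignores capitalization, as in the procedure described above.
    """
    alternatives = "|".join(re.escape(keyword) for keyword in keywords)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    flat = " ".join(remove_line_hyphenation(text).split())
    hits = []
    for match in pattern.finditer(flat):
        left = flat[max(0, match.start() - width): match.start()]
        right = flat[match.end(): match.end() + width]
        hits.append((left, match.group(0), right))
    return hits


sample = (
    "On the basis of such Data, we can reject the view.\n"
    "The exam-\nples in 44 illustrate this."
)
hits = concordance(sample, ["data", "datum", "example", "examples"])
for left, token, right in hits:
    print(f"{left:>30} [{token}] {right}")
```

Tabulation, step (5), would then amount to counting the returned tokens per type and categorizing each usage by inspecting its context.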

Corpus    No. of Words    No. of Articles    Min. Size    Max. Size
TL        217,450         14                 7278         35543
LS        89,025          15                 2153         9515

Table 1. Corpora. 


(44)               SINGULAR                 PLURAL
a. ‘cat’           kazh                     kizhier
b. ‘squirrel’      kazh-koad [cat-wood]     kizhier-koad

Figure 1. Excerpt from [TL: L 67-4; 675.725]. 



[Graph. Legend: □ LamH Id; • Anti-laminin Activity. Axis label: Monoclonal Antibody.] 

Figure 2. Excerpt from [LS: IMG 51-1; 20.29]. (This figure has been redrawn in this 
volume for clarity.) 

The type-token distinction is critical here. By type, I refer to a generalized category 
of observations. By token, I refer to a single observable instance of a type. As I point 
out in Coleman (2000), an example such as that offered by Stump (1991) in Figure 1 
does not constitute a single observation, but rather stands for the entire class of purported 
past observations of four grammatical forms as well as any and all potential 
future ones. As such, it is a generalized type. The example offered by Ward et al. (1991) 
in Figure 3, marked as an ungrammatical sentence, cannot constitute an observable 
entity at all. It is not only a generalized type, but one which in principle can never 
be observed at all. In contrast, a data point (e.g., one in the form V) on the graph 
from Fitzsimons et al. (2000) seen in Figure 2, represents a single observation of two 
properties (which the authors label ‘Serum Activity OD 405’ and ‘Monoclonal Antibody’, 
respectively) of a given mouse in their experiment. Hence, any confusion about 
tokens vs. types (data vs. examples) involves confusion about what one is actually 
observing vs. what conclusions one is drawing from observations. 

...the same sort of pragmatic influence fails to salvage the following example, 
where there is clearly no difficulty in interpreting the anaphor: 

(21) *I’ll eat oysters on occasion, but I’m not really much of a them lover. 

On the basis of such data, we can reject the view that... 


Figure 3. Excerpt from [TL: L 67-3; 439.474]. 


In the Gambian example in Fig. 1, the opposite end of the spectrum is illustrated... 

Figure 4. Excerpt from [LS: AB 59-1; 1.10]. 

4. data tokens (data/datum). Authors from either corpus were as likely to use 
a token of the type data as those from the other (W=78.0, p>0.24). TL authors were as 
likely to refer to ‘the data’ in a generic sense or to non-specific, topically-defined 
domains of data as were LS authors (W=102.5, p>0.32). Such references seem to 
occur frequently when authors are discussing some methodological aspect of their 
work. These two factors varied across individual authors, but not by corpus. 

Things typically referred to via data tokens in the TL corpus were things like 
Stump’s (1991:696) item (44), seen in Figure 1. Things typically referred to via data 
tokens in the LS corpus more closely resembled that in Figure 2, from Fitzsimons et al. 
(2000:24), their figure labelled ‘Serum anti-laminin and LamH-idiotype (Id) reactivity 
after administration of trans-gene hybridomas to histocompatible normal mice’. 

Authors in the TL corpus tend to use data and example interchangeably. Consider 
the excerpt from Ward et al. (1991:452) in Figure 3. 

A similar usage is the item already seen in Figure 1. Although Stump (1991) refers 
to it individually as an example on page 695 (‘The examples in 44 illustrate this’), it is 
clearly within the scope of what is referred to as data at the start of the article (‘All of 
the Breton data cited here...’, p. 675). Indeed, it is striking that some TL authors seem 
to use tokens of data and example in complementary distribution, as if [-specific] 
and [+specific] variants of each other. 

In addition to the numerous tokens of data and example used coreferentially, 
there are also many where they refer to the same types of things in the articles, typically 
numbered text items marked with asterisks, bracketing, prosodic contours, and 
so on. I presented many such examples in Coleman (2000). I will beat this (I hope) 
dead horse no further. 

Authors in the LS corpus essentially never use data to refer to the same things in 
their papers as they do when using example. There is one case that is unclear: in AB 
59-1; 1.10, the author refers to a figure containing three graphs as showing data; there 
are two tokens of data in the caption at the foot of the graphs (Figure 5, overleaf). 
But elsewhere at one point, she says that it contains an example (Mace 2000:4). No 
other such cases in which data and example are or are possibly conflated appear in 
the LS corpus. 

[Three graphs, (a) to (c). Axis label: Age (years).] 

Figure 1. Cross-sectional data on the life history of rural Gambians 
based on data collection by Sir Ian McGregor from Keneba and 
Manduar villages between 1950 and 1975. (a) Average annual 
weight gain as a proportion of total weight, between birth and 25 
years of age for males and females, (b) Annual mortality hazard for 
males and females over the life span, (c) Age-specific fertility 
(numbers of live births per year) for males and females over the life 
span (3-year running means). 


Figure 5. Excerpt from [AB 59-1; 1.10] (Mace 2000:3). (This figure has been 
redrawn in this volume for clarity.) 


In Coleman (2000), I used the argument structure analysis scheme of Toulmin (1964) 
to analyze the place of what authors in TL and AL (applied linguistic) corpora refer 
to as data. TL authors tend to use what they refer to as data to provide ‘confirming 
evidence’ (the term used by Botha 1973:35). The tendency among the LS authors is to 
use their data for hypothesis-testing, i.e., for disconfirmation of a hypothesis, 
much like those of the AL corpus analyzed in Coleman (2000). 

Examples cited by TL authors are typically not tokens, per se. An example 
(data) such as Stump’s (1991:696) ‘Breton data’ in Figure 1 is not offered 
as an event (a token) having been observed at a specific time and place, 
but represents any and all potential and/or actual occurrences of the supposed 
grammatical forms as types in any given utterances. Asterisked items, 
such as that given in Figure 3 (Ward et al. 1991:452), of course, are by definition 
never observed, so what represents data in this case is a supposed ‘native-speaker 
judgment’ of the item’s ungrammaticality. I say ‘supposed’, since TL authors in 
my corpus are unlikely to identify precisely the observational sources of their 
data, very much less so than LS authors (W=39.5, p<0.02). 


To clarify: I consider the observational source to be precisely identified only if an
author states explicitly what real-world event has been observed and under what
conditions. To say, for example, ‘I got this data from Smith (2001)’ does not say where
Smith got it. To say, ‘I got this data item from Time, July 5, 1998, p. 7; that data item
from The Washington Post, Jan. 14, 1998, p. A8’, in contrast, is a clear identification,
as would be descriptions of interview protocols, experimental methods, questionnaires,
and of the persons upon whom these instruments were used.

5. EXAMPLE TOKENS (EXAMPLE, EXAMPLES, EXEMPLIFY, EXEMPLIFIES, ETC.). Authors
in TL were much more likely to use example than were those in LS (W = 199.0,
p < 0.00001). As mentioned above, example and data were frequently conflated in
the TL corpus, but only once (possibly) in the LS corpus. The difference was significant
at a high level of statistical reliability (W = 194.0, p < 0.0001).

6. CLEAR OVERALL DIFFERENCE BETWEEN THE TWO CORPORA. Using just three of the
variables measured in this study, it is possible to correctly predict whether a given text
is from the TL corpus or the LS corpus at an overall accuracy above 86%. The variables
in question are the proportion of data which is explicitly sourced (DSrcProp), the rate
of use of example (ExRate), and the rate of usage of example to co-refer to data
(EeqDRate). Because it is a proportion (the proportion of data tokens which indicate
items that have been sourced), values for DSrcProp run from 0 to 1. ExRate and
EeqDRate represent, respectively, (a) the overall rate of example tokens in a given text
as a whole and (b) the rate of example tokens which refer to the same things or the
same type of things that the author refers to via a data token. ExRate is the number
of example tokens in an article divided by the total number of words in the article.
EeqDRate is the number of EXAMPLE-as-data tokens in an article divided by the total
number of words in the article.
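
The three definitions above amount to simple ratios over hand-coded token counts. The following Python sketch is only an illustration of that arithmetic: the class and field names are hypothetical, the counts are invented, and returning no value for an article without any data tokens is an assumption, since the text does not say how such articles were scored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ArticleCounts:
    """Hand-coded token counts for one article (all field names are hypothetical)."""
    total_words: int
    data_tokens: int             # tokens of DATA
    sourced_data_tokens: int     # DATA tokens whose observational source is identified
    example_tokens: int          # tokens of EXAMPLE, EXAMPLES, EXEMPLIFY, etc.
    example_as_data_tokens: int  # EXAMPLE tokens that co-refer with DATA


def dsrc_prop(c: ArticleCounts) -> Optional[float]:
    """Proportion of DATA tokens that are explicitly sourced (0 to 1)."""
    # Assumption: the proportion is left undefined when there are no DATA tokens.
    if c.data_tokens == 0:
        return None
    return c.sourced_data_tokens / c.data_tokens


def ex_rate(c: ArticleCounts) -> float:
    """EXAMPLE tokens divided by the total number of words."""
    return c.example_tokens / c.total_words


def eeq_d_rate(c: ArticleCounts) -> float:
    """EXAMPLE-as-data tokens divided by the total number of words."""
    return c.example_as_data_tokens / c.total_words


# Invented counts for illustration only; they are not taken from either corpus.
article = ArticleCounts(
    total_words=5000,
    data_tokens=20,
    sourced_data_tokens=15,
    example_tokens=10,
    example_as_data_tokens=6,
)
print(dsrc_prop(article), ex_rate(article), eeq_d_rate(article))
```

Because ExRate and EeqDRate share the same denominator, EeqDRate can never exceed ExRate for a given article.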

The medians for all three of these variables showed significant differences
between the two corpora: DSrcProp (LS = 0.6809, TL = 0.0000, W = 52.5, p < 
0.02), ExRate (LS = 0.0002, TL = 0.0020, W = 199.0, p < 0.00005), and EeqDRate 
(LS = 0.0000, TL = 0.0012, W = 194, p < 0.00003). 

A discriminant analysis based on the same three variables yields Wilks’ Lambda =
0.467379; χ² is used as the test of significance (χ² = 19.3957, p = 0.0002). A classification
table (Table 2, overleaf) shows that over 86% of the LS articles are correctly
assigned to their corpus in this fashion, as are over 85% of the TL articles.
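
One common way of turning Wilks’ Lambda into a χ² statistic is Bartlett’s approximation, χ² = -(N - 1 - (p + g)/2) ln Λ, with p(g - 1) degrees of freedom. The paper does not name the approximation it used, so applying Bartlett’s here is an assumption, as is taking N = 29 articles (15 LS plus 14 TL, the group sizes in Table 2), p = 3 variables and g = 2 groups. Under those assumptions, the Python sketch below gives a statistic that agrees with the reported 19.3957 to four decimal places.

```python
import math


def bartlett_chi_square(wilks_lambda, n_cases, n_vars, n_groups):
    """Bartlett's chi-square approximation for Wilks' Lambda."""
    factor = n_cases - 1 - (n_vars + n_groups) / 2
    return -factor * math.log(wilks_lambda)


def chi_square_sf_3df(x):
    """Upper-tail probability of a chi-square variable with exactly 3 degrees of freedom."""
    # Closed form for 3 df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Assumed inputs: N = 15 + 14 articles, 3 predictor variables, 2 corpora.
chi2 = bartlett_chi_square(0.467379, n_cases=29, n_vars=3, n_groups=2)
df = 3 * (2 - 1)
p_value = chi_square_sf_3df(chi2)
print(round(chi2, 4), df, round(p_value, 4))
```

The closed-form tail probability is specific to three degrees of freedom; a different number of variables or groups would need a general χ² routine.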

Based on a suggestion made by Stephan Gries (Southern Denmark University) at
the end of the oral version of this paper at LACUS 2001, I tried repeating the analysis
with a randomly-selected sub-sample of the data. The first randomization yielded
n_LS = 7, n_TL = 9. This happened to result in 87.50% of the total cases correctly classified
(LS: 85.71%, TL: 88.89%; discriminating function p < 0.0011). The second randomization
yielded a different subset, also with n_LS = 7, n_TL = 9. This happened to result in 100.00% of
the total cases correctly classified (discriminating function p < 0.0001). A third randomization
yielded n_LS = 8, n_TL = 5. The smaller number of samples happened to result
in only 84.62% of the total cases correctly classified (LS: 100.00%, TL: 60.00%; with a
non-significant but ‘suggestive’ discriminating function p < 0.0867).

I redid the original calculation on the whole data set with a step-wise approach, 
forward and backward, both of which showed that ExRate was contributing the most 
to the accuracy of the classification of corpus items. Out of curiosity, I checked for 
a correlation between ExRate and EeqDRate and found a strong correlation, indeed 
(r = 0.938574, p < 0.0001). In other words, the higher the rate of example tokens, the 
more likely the conflation of data and examples. 
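
For readers who want to see how a correlation coefficient of this kind is obtained, here is a minimal Python sketch of Pearson’s r. The per-article rates in it are invented for illustration and are not the values measured in the study; the function name is likewise hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2 items)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented ExRate and EeqDRate values for five imaginary articles.
ex_rates = [0.0002, 0.0005, 0.0010, 0.0020, 0.0031]
eeq_d_rates = [0.0000, 0.0001, 0.0006, 0.0012, 0.0020]
r = pearson_r(ex_rates, eeq_d_rates)
print(r)
```

Since every EXAMPLE-as-data token is also an EXAMPLE token, some positive association between the two rates is built into their definitions.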

Actual Corpus   Group Size   Predicted Corpus
                             LS             TL
LS              15           13 (86.67%)    2 (13.33%)
TL              14           2 (14.29%)     12 (85.71%)

Percent of all cases correctly classified: 86.21%


Table 2. Discriminant analysis classification table.
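
The percentages in Table 2 follow directly from the cell counts. As a check on the arithmetic, this short Python sketch recomputes them; the function name and the dictionary layout are hypothetical conveniences, while the counts are those given in the table.

```python
def classification_summary(table):
    """Return per-group percentages and the overall percent correctly classified.

    `table` maps each actual corpus to a dict of predicted corpus -> count.
    """
    per_group = {}
    correct = 0
    total = 0
    for actual, row in table.items():
        size = sum(row.values())
        per_group[actual] = {pred: round(100 * n / size, 2) for pred, n in row.items()}
        correct += row.get(actual, 0)  # cases predicted to be in their own corpus
        total += size
    return per_group, round(100 * correct / total, 2)


# Counts from Table 2: rows are the actual corpus, columns the predicted corpus.
table_2 = {
    "LS": {"LS": 13, "TL": 2},
    "TL": {"LS": 2, "TL": 12},
}
per_group, overall = classification_summary(table_2)
print(per_group, overall)
```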



Figure 6. 3-D scatterplot.


Figure 6 shows a scatterplot of the three variables in question (DSrcProp, ExRate, 
and EeqDRate) using the original (complete) data set. Each article in the LS corpus 
is represented by an open downward-pointing triangle (▽), each article in the TL
corpus, by a solid upward-pointing triangle (▲). Note how the LS (▽) items cluster in
the left corner, while TL articles (▲) follow a sort of backwards ‘J’ toward the opposite
corner. Two of the TL articles, in particular, are extremely difficult to differentiate from 
the LS cluster—they are MacLaury (1991) [DSrcProp = 0.79] and Boysson-Bardies & 
Vihman (1991) [DSrcProp = 1.0]. MacLaury’s was a study on color semantics using 
a variety of interview-based elicitation techniques (p. 37) with a number of human 
subjects. Boysson-Bardies & Vihman used audio- and video-recorded material from 
20 infants; they analyzed their results statistically. 


7. conclusions. What does all this suggest for linguistics as a field of study? The LS
corpus is very consistent in regard to the identification of data as something observable
in terms of objects and events that exist apart from the observer, not strictly
‘objects’ created by the viewpoint of the observer (and existing only in that observer’s
subjective experience)—to paraphrase Saussure (1959:8) on the ‘objects of language’.
As a whole, the TL corpus is very unlike the LS corpus in this regard.

The LS corpus contains at most one case in which references to data (concrete
observation tokens) and an example (a type, and therefore an abstraction) are conflated.
Conflation of data and example, of concrete token and abstract type, is more
the rule than the exception in the TL corpus.

However, even in the TL corpus this conflation is not universal; obviously, it does
not have to be. Nor does linguistic data need to consist of things that no author in the
LS corpus (or some other scientific corpus) would label as such.

Finally, I must emphasize that what I have been discussing are not simply issues
of terminology. They reflect underlying practices and approaches. It can be no coincidence
that the two articles (MacLaury 1991 and Boysson-Bardies & Vihman 1991) in
the TL corpus which appear closest to the LS cluster in Figure 6 are also two articles
which exhibit perhaps more clearly noticeable characteristics of ‘social science’ methodologies
and genres than the majority of the other articles in the TL corpus.


LINGUISTICS AS AN EMPIRICAL SCIENCE: THE STATUS OF
GRAMMATICALITY JUDGMENTS IN LINGUISTIC THEORY


Patrick J. Duffley 
Laval University 


‘Anyone who has taught an introductory syntax course has had the experience of
presenting an “ungrammatical” example only to be told by some smart-aleck about 
an unsuspected interpretation on which the sentence is quite normal’ (McCawley 
1982:78). A recent discussion with one of my colleagues, in which I had the pleasure 
of playing the part of the smart-aleck, led me to some serious reflection about the 
question of grammaticality judgments and their role in linguistic methodology. 
The conclusions of these reflections will be presented in this paper. My colleague and 
I were examining sentences (1)a and (1)b below, the first of which was claimed to be
acceptable and the second not: 

(1) a. What did John hurt himself fixing? 
b. *What did John hurt Bill fixing? 

Although I could sense that there was some sort of difference in ease of interpretation
between these two sentences in favour of (1)a, I was not at all happy with the
suggestion that (1)b was ‘less acceptable’ or ‘less grammatical’ than (1)a. So I asked
my colleague in what sort of context someone would say (1)a, and the description
ran more or less as follows: ‘the speaker knows that John was fixing things around
the house yesterday and that he hurt himself while fixing something, but he does not
know what that thing was and would like to be informed thereof’. This led to the logical
rejoinder that if the speaker knew that John was fixing things around the house
with Bill yesterday and that he hurt Bill while fixing some object whose identity the
speaker was ignorant of but wished to know, then (1)b is a perfectly acceptable English
sentence for obtaining that information.

The sort of discussion referred to above, which is typical in linguistic circles, leads
one to wonder just what people (linguists included) are doing when they make so-called
grammaticality judgments. It also raises the more basic question of why some
linguists use such judgments as their prime source of data. What are they hoping
to prove? Can grammaticality judgments provide the sort of evidence that these linguists
are looking for?

It is not at all evident why an account of speakers’ competence in understanding
and producing language should be based on behaviour in a situation where they
are doing neither, but rather are being asked to report their intuitions about the
acceptability of certain sequences of words. The motivations for following such a
procedure can be boiled down to four main reasons:

First, by eliciting judgments we can examine reactions to sentence types that 
might occur only very rarely in spontaneous speech or recorded corpora. This is 
a standard reason for performing experiments in social science—observational 
study does not always provide a high enough concentration of the phenomena 
we are most interested in. A second, related reason for using grammaticality 
judgments is to obtain a form of information that scarcely exists within normal 
language use at all—namely, negative information, in the form of strings that 
are not part of the language. The third reason for using judgments is that when 
one is merely observing speech it is difficult to distinguish reliably slips, unfinished
utterances, and so forth, from grammatical production. A fourth and more
controversial reason is to minimize the extent to which the communicative and 
representational functions of language obscure our insight into its mental nature. 
(Schütze 1996:2-3)

With respect to the last reason, Schütze admits that it ‘presupposes a particular view
of grammatical competence as cognitively separate from other facets of language 
knowledge and use, and hence its validity depends on one’s theoretical stance on 
the issue’. It will be my contention in this paper that all of the other reasons except the 
first are also theory-dependent and that on top of this grammaticality judgments are 
practically worthless as scientific evidence, even if one accepts the theoretical
presuppositions of generative grammar.

To start with the last point, it must be realized that the term ‘grammaticality judgment’
itself is in fact a misnomer. What is actually meant would be better expressed by
the term ‘acceptability judgment’ (cf. Schütze 1996:26). Given Chomsky’s (1965:10-11)
definition of grammaticality as belonging to the sphere of competence, that is, of
the ideal speaker’s knowledge of his language, it makes no sense to speak of ‘grammaticality
judgments’, since grammaticality is not accessible to people’s intuitions: all
a native speaker can do is judge a string’s acceptability. So what people are actually
doing when reacting to a sequence of words presented to them by a linguist is judging
whether it seems acceptable to them or not. Is there any relation, then, between
acceptability and grammaticality which would allow inferences to be made about the 
latter based on the former? 

According to generative theory there is indeed a relation between these two concepts,
whose nature is described in the following well-known passages:

Acceptability is a concept that belongs to the study of performance, whereas 
grammaticalness belongs to the study of competence... Grammaticalness is only 
one of the many factors that interact to determine acceptability. Correspondingly,
although one might propose various operational tests for acceptability, it is
unlikely that a necessary and sufficient operational criterion might be invented
for the much more abstract and far more important notion of grammaticalness.
(Chomsky 1965:10-11)

...linguistics as a discipline is characterized by attention to certain kinds of
evidence that are, for the moment, readily accessible and informative: largely, the
judgments of native speakers. Each such judgment is, in fact, the result of an
experiment, one that is poorly designed but rich in the evidence it provides. In
practice, we tend to operate on the assumption, or pretense, that these informant
judgments give us ‘direct evidence’ as to the structure of the I-language, but, of
course, this is only a tentative and inexact working hypothesis... In general, informant
judgments do not reflect the structure of the language directly; judgments
of acceptability, for example, may fail to provide direct evidence as to grammatical
status because of the intrusion of numerous other factors. (Chomsky 1986:36)

These observations show that Chomsky himself is aware of the indirectness of the
link between acceptability (performance) and grammaticality (competence), as indicated
by the reference to the ‘numerous other factors’ that interact with the grammar
to produce acceptability. Just how numerous and uncontrollable these factors are is
shown by the studies of Birdsong 1989 and Schütze 1996. They comprise things like:

(a) the instructions given to the subjects doing the questionnaire 

(b) the order of presentation of the sentences submitted for speaker reactions 
(the first sentences in a questionnaire tend to be judged much more severely 
than the others; cf. Greenbaum 1973, 1976)

(c) the effect of the repetition of an unacceptable structure leading people to 
accept it 

(d) judgment strategies (one does not know whether the subjects are using the 
same criterion to decide on acceptability) 

(e) modality and register (a written questionnaire already represents a fairly 
formal context for most speakers; cf. Greenbaum 1977) 

(f) how much time is given to the informants to react 

(g) context (is it easy to imagine a possible context for the sentences?) 

(h) meaningfulness (can people make sense of the sentences?) 

(i) length and complexity of sentence judged 

(j) frequency of constructions (less frequent structures are often judged
unacceptable)

(k) lexical content of items 

(l) rhetorical structure (structural parallelism renders certain sequences acceptable
which are not perceived so otherwise; cf. Langendoen 1972).

I would add to this long list an even more basic and important factor: the very fact
of asking a speaker to make an acceptability judgment is asking him to do something
completely unnatural. What people are accustomed to doing with their
language is not making grammaticality judgments, but simply using it to express
themselves. Add to all this the fact that one is not supposed to know what the grammar
looks like—it is what the linguist is trying to determine—and you get a bag so
mixed that its contents are impossible to sort out.

Why, then, have recourse to data whose connection with the object of one’s
hypothesis is so tenuous? This question takes us back to the motivations given in
defense of this procedure—rarity of significant data in spontaneous speech, the need
to obtain negative data, the facilitation of distinguishing performance errors from
grammatical production, the separating out of the communicative and representational
functions of language from its mental structure. When scrutinized more
closely, all of these reasons except the first turn out to be products of the theoretical
stance adopted by generative grammarians. I have already alluded to Schütze’s admission
that this is the case for the last reason. Regarding the facilitation of the distinction
between performance errors and grammatical production, it should be fairly
obvious that this alleged justification merely begs the question by presupposing some
concept of ‘grammatical production’. Moreover, as shown by the enumeration given
above of the many possible performance errors which can occur in the making of
grammaticality judgments, the use of such judgments does not facilitate the identification
of potentially extralinguistic factors which have an impact on the data but
merely adds further factors to the list. This has led Schütze (1996:179-180) to observe
that: ‘In fact, it might appear that grammaticality judgments are the worst way to get at
linguistic competence, as compared to production and comprehension, because they
involve the interaction of many more factors’.

Under Schütze’s pen, this is a merely rhetorical objection. He goes on to give two
reasons why this does not constitute grounds for abandoning grammaticality judgments
as a source of data:

1. while more factors are involved in such judgments, they ‘might be less mysterious
than those connected to language use’ (how could we ever define the
‘understanding of a sentence’ or ‘communicative intentions’ and how could
we draw conclusions about grammaticality from them?);

2. grammaticality judgments provide an alternative path to the grammar (they 
are subject to different influences than language use is and so facilitate the 
search for the common core that underlies both types of behaviour, i.e., 
the grammar). 

Neither of these reasons stands the test, however. Concerning the first motive, one of
the crucial factors impacting on grammaticality judgments is necessarily whether the
subject can understand the sentence or not (cf. Schütze 1996:162), a fact which would
seem to make natural language comprehension no more ‘mysterious’ than such judgments.
Moreover, if one compares comprehension and grammaticality judgments
in Schütze’s own model (p. 175), one notes that both are determined by the same
four factors—input, knowledge (general, contextual, etc.), competence and parsing
strategies—with four other factors being added for grammaticality judgments. This
only makes sense—in order to judge a sentence’s acceptability, one must first comprehend
it—but the facts certainly do not support the suggestion that the process of
judging a sentence for grammaticality is any less obscure than that of comprehending
the sentence in a natural context—quite the contrary. As for the argument that grammaticality
judgments provide an alternative path to the grammar, it is much sounder
methodology to begin with the cases where it appears probable that the fewest factors
are involved before attempting to come to grips with the more complex cases.
As Birdsong (1989:72) puts it, ‘the hypocrisy of rejecting linguistic performance
data as too noisy to study, while embracing metalinguistic performance data as
proper input to theory, should be apparent to any thoughtful linguist’.

Schütze’s second reason—the need to obtain negative data—brings us even closer
to our objective of understanding why linguists of the generative school have such
regular recourse to grammaticality judgments. The very fact of needing to discriminate
between certain sequences of words that are ‘part of the language’ and others
which are not implies a certain view of both grammar and language which is peculiar
to generative theory. The citation below provides a capsule summary of this view:

A major objective of linguistic research is to construct a grammar capable of generating
all the grammatical sentences and no ungrammatical ones. This research
involves identifying the rules that allow speakers to determine which sentences 
of their language are well-formed and which are not. (O’Grady & Dobrovolsky 
1987:103) 

Particularly revealing in this quotation is the close relation made between the project
of constructing a generative grammar and the search for the rules that allow speakers
to judge which sentences are well-formed and which are not. This suggests that a
transfer has taken place from the role the grammar is claimed to have in the theory to
the role of the subject in a grammaticality judgment: just as the grammar determines
what is well-formed and what is not, so the speaker confronted with a string of words
in a questionnaire decides what is structurally good and what is not. However, this is
definitely not what people do when they comprehend what others are trying to say
in a normal speech situation (nor is it, as we have seen above, what they are doing
when they make grammaticality judgments). Such a view of grammar makes it an
algorithm for performing structural ‘grammaticality’ choices rather than an instrument
for carrying on communication.

Examples of this procedure abound; to give a typical case, one might refer to
Givón’s studies on causative verbs in two articles entitled ‘Cause and Control: On
the Semantics of Interpersonal Manipulation’ (1975) and ‘The Binding Hierarchy and
the Typology of Complements’ (1980). In his discussion of the verbs cause, make
and have, Givón claims that these English verbs may be scaled according to two
semantic properties which are universally attested: (a) intended (‘controlled’) vs.
unintended (‘uncontrolled’) causation; (b) mediated vs. direct causation (1980:335).
Cause is a ‘noncontrol causation verb’, make a ‘direct control causation verb’ and
have a ‘mediated control causation verb’. When the data supporting these ‘generalizations’
is confronted with actual usage, however, the suspicion arises that the data was
fabricated to support the universal semantic properties rather than the latter being
inferred from an observation of usage. For instance, the claim that make denotes
deliberately intended causation while cause evokes accidental causation is based on
the purported contrast in acceptability between (2)a and (2)b:

(2) a. John accidentally/inadvertently caused Mary to drop her books.
b. *John accidentally/inadvertently made Mary drop her books. 

Actual usage shows, however, that the distinction between cause and make has nothing
to do with intentionality. On the one hand, cause can denote a deliberate action,
as in (3): 

(3) If a person has thoughtlessly or deliberately caused us pain or hardship... 
(Brown U. Corpus B08 0470)

On the other hand, make can evoke unintentional causation, as in: 

(4) Other women—they only made me love you more. 

(O’Neill 1955 [vol. i]:130)

The analysis of have as denoting mediated causation, which is intentional but requires 
the intervention of a third party, suffers from a similar lack of support from the 
empirical data. Givón adduces the purported contrast in acceptability between (5)a
and (5)b/c: 

(5) a. I had her lose her temper by sending John to taunt her. 

b. ? I caused her to lose her temper by sending John to taunt her. 

c. ? I made her lose her temper by sending John to taunt her. 

Actual usage in this case would seem, however, to be exactly the opposite of Givón’s
judgements: (5)a makes no sense at all, while (5)b and (5)c are quite normal. Have is
used in English to evoke getting someone to do something by exercising one’s authority
or control over them through a request or command, as in (6) below:

(6) The teacher had me recite my poem in front of the class. 

This does imply intentionality on the part of the causer and compliance on the part
of the causee, but there is no idea at all of mediation by a third party suggested by the
meaning of the construction illustrated in the sentence above.

The conclusion we have been led to, therefore, is that grammaticality judgments
are not a reliable source of empirical data. What speakers are doing when they perform
such judgments is appraising the acceptability of the utterances they are being
asked to evaluate. Even if one accepts the hypothesis of a separate grammatical
module constituting one of the important factors which determine acceptability, the
causal link between acceptability and grammaticality does not allow one to make
inferences from one to the other. If natural language production (i.e., performance)
is viewed as inadequate data for inferring conclusions about grammaticality, the data
provided by grammaticality judgments must be considered even less trustworthy. As
a type of metalinguistic behaviour, these judgments are themselves just another sort
of performance, and as such they are subject to even more confounding factors than
natural utterance production.

One might palliate some of the drawbacks of grammaticality judgment data by the
design of the questionnaire used to elicit them. As shown by Gries (this volume),
the inclusion of experimental controls such as randomized presentation of sentences,
inclusion of fillers, and clear exemplification and explanation of the required judgment
process, can elicit reactions which correspond fairly closely to corpus data.
Moreover, if sentences were presented to informants with a context, that is, a description
of the communicative situation, then, in this more natural setting, one should
obtain more reliable judgments of acceptability.

The fact remains, nevertheless, that sound methodology would advise one to first
study language in its natural setting before placing speakers in an artificial situation
and asking them to do something entirely different from everyday language use. The
very nature of a questionnaire suggests a testing of the informants’ ability to conform
to some norm of expected behaviour, and triggers the reaction ‘what should
one say in this situation?’. Even if one were to succeed in eliminating this conditioned
reflex—something highly unlikely in the present author’s opinion—there still
remains a hypothetical element inherent in the nature of a questionnaire: informants
are being asked to answer the query ‘what would one say in this situation?’ Neither
question corresponds necessarily to what the speaker actually says in a given situation.
Thus, in any case, one is driven back to actual usage as the final test of the
explanatory capacity of any theory. Isn’t what people actually say what we linguists are
supposed to be explaining in the first place?


How helpful and accurate is the lexicon of science? Scientists and teachers
of science since Lavoisier two hundred years ago have stated that knowing science
means knowing the language for its concepts. Igor Mel’chuk (2001) has discussed the
disarray of terms in European science in the last century. The British science journal
Nature recently showed the concern of science with its lexicons by beginning a
weekly column entitled ‘Words’, reporting the first use of the word scientist in 1834
(Danielson). Recently, it reported that the Sudbury Neutrino Observatory in Ontario
combined its new measurements with some from the Super-Kamiokande detector in
Japan to show that neutrino particles have unexpected mass and sometimes switch
their identities. When their identities change, should their names change? An editorial
called the finding ‘not a crisis for existing models, but a route to deeper ones’ (Nature
2001a). It welcomed the implications for the theory called the ‘standard model’, GUT
(‘Grand Unified Theory’), TOE (‘Theory of Everything’), and SUSY (supersymmetry),
as well as the yet-unseen Higgs boson¹.

If the lay public theorizes that science is difficult only because of its vocabulary,
then the reasons for a difficult lexicon ought to be examined. What evidence of difficulty
appears in the morphology of terms in science in English, the leading language
of science now? How helpful is the morphology of the lexicons of science? How
useful are the many glossaries? Since sciences have different vocabularies, I examined
separately the lexicons of two different fields that are currently making great strides. I
report comparative tabulations of the morphological patterns of terms in the fields of
particle physics and neuroscience. I give examples and conclude with reasons for difficulties,
based on the ways scientists must work. All the terms I discuss are standard
ones used in publications, not vernacular for informal conversation in the lab.

Particle physics deals with extremely rare tiny particles, too small for instruments
to detect, usually predicted only by logical gaps in paradigms and identified only by
their effects. They can appear to be in two places at once. The Heisenberg Uncertainty
Principle holds that not everything, such as location and velocity, can be known
simultaneously. ‘Spooky’ is how Einstein described one of his own thought experiments².
Astrophysicist John Gribbin is dissatisfied with the terms particles and waves, which he
calls metaphors. ‘We call those objects particles, for want of a better name. What they
really are, we do not know’ (1998b:51-52). The Nobel Prize-winning physicist Steven
Weinberg also expresses concern about names for the materials he studied:

...quantum mechanics has transformed the very language we use to describe
nature: in place of particles with definite positions and velocities, we have learned 
to speak of waves and probabilities. Out of the fusion of relativity with quantum 
mechanics there has evolved a new view of the world, one in which matter has 
lost its central role. This role has been usurped by principles of symmetry, some 
of them hidden from view in the present state of the universe. On this foundation 
we have built a successful theory of electromagnetism and the weak and strong 
nuclear interactions of elementary particles. Often we have felt as did Siegfried 
after he tasted the dragon’s blood, when he found to his surprise that he could 
understand the language of birds. (Weinberg 1992:3-4). 

Notice how easily the physicist moves from science to mythology. This movement 
contradicts the stereotype, but it is not unusual for creative scientists. They use their 
imaginations, enjoy their work, and welcome new information that may make them 
revise their imperfect theories. 

Neuroscience did not exist in 1848 when a large metal bar was blasted straight 
through the brain of Phineas Gage. He lived and provided the basic case study for a 
new field. Now neuroscientists are exploring electrical impulses between the billions 
of tiny neurons in our brains and prompting biological discussions of language use, 
including puns and slips of the tongue (e.g. Lamb 1998). 

materials and methods. To explore the lexicons of these two sciences, I tabulated 
the frequency of acronyms, eponyms, nominalizations, learned vocabulary, special¬ 
ized uses of common words, translations, and other types of etymology. I examined 
all the words that appeared in both the index of The Brain (Time-Life Books 1990) 
and The Oxford Companion to The Mind (Gregory 1987). To match those words in 
number, I selected the terms on every third page of Q is for Quantum: Particle Physics 
from A to Z (Gribbin 1998a). These glossaries have extensive explanations to enable 
uninitiated lay readers to comprehend writing for the initiated. I also consulted the 
American Heritage Dictionary (AHD 2000) and the Oxford English Dictionary (OED 
1999, cd-rom 2.0). The basic 415 terms were not adequate guides to the vocabulary 
of articles labeled for these fields in current issues of the journals Science and Nature; 
I combine examples of the missing newer terms with the items in the glossary when 
the newer terms illustrate a problem better. 

Comparisons of tabulations show that the two fields are obscure in different ways. 
Reorganizing the tabulations according to the motivation for the vocabulary choices 
shows that the underlying motivations are similar in the two fields, although not 
always proportionate. The following are the resulting seven reasons for difficulty. 

1. research methods utilize Greek and Latin roots. Differing research methods
and materials result in disproportionate use of Latin and Greek roots. Particle physics
attempts description, despite the invisibility of the content. For example, hadrons,
from Greek for ‘thick,’ are so small that more than a hundred may be needed for an
atom. However, they have much more mass, or are ‘thicker,’ than leptons, elementary
subatomic particles named from Gk leptos, ‘fine, thin,’ from lepein, ‘to peel,’ not to be
confused with leptin and Lep, material which affects neuronal activity (Cowley 2001:411),
or with LEP, the acronym for the Large Electron-Positron collider in Europe.

Neuroscience used nearly twice as many Latin and Greek sources as particle phys¬ 
ics (97 v. 55). These roots were almost five times as likely to refer to the location, shape, 
or appearance of what was named (34 v. 7). Examples include cortex (from Latin ‘bark’ 
for the textured outer layer of gray matter), lateral geniculate nucleus (from Latin 
‘side,’ ‘bent like a knee,’ and ‘little nut’ for the visual center), and amygdala (from
Latin from Greek ‘almond shaped’ for a certain mass of gray matter at the back of 
the brain). These names reflect the extensive use of Latin roots in traditional medical 
terminology. 
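
The comparisons in the preceding paragraph are simple arithmetic on paired tallies, and a reader can recheck them. The following is a minimal sketch, assuming only the two pairs of counts quoted above; the names TALLIES, count_ratio, and share are illustrative and are not part of the original tabulation. It prints the ratio of the raw counts and, as one alternative reading of ‘as likely’, the proportion of each field’s classical roots that describe location, shape, or appearance.

```python
# Tallies quoted in the paragraph above: (neuroscience, particle physics).
# The dictionary and function names are illustrative assumptions.
TALLIES = {
    "Latin and Greek sources": (97, 55),
    "roots naming location, shape, or appearance": (34, 7),
}


def count_ratio(neuro, physics):
    """Ratio of the raw neuroscience count to the particle-physics count."""
    return neuro / physics


def share(part, whole):
    """Proportion of `whole` accounted for by `part`."""
    return part / whole


for label, (neuro, physics) in TALLIES.items():
    ratio = count_ratio(neuro, physics)
    print(f"{label}: {neuro} v. {physics}, ratio {ratio:.2f}")

# Within-field proportions: descriptive roots as a share of all classical roots.
neuro_share = share(34, 97)
physics_share = share(7, 55)
print(f"descriptive share: {neuro_share:.0%} v. {physics_share:.0%}")
```

The raw-count ratios correspond to the wording ‘nearly twice’ and ‘almost five times’; the within-field proportions are a second, hedged way of looking at the same tallies and are not a claim made in the original tabulation.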

Descriptive names identify but do not explain purpose. They reflect how neuro¬ 
scientists identify a physical structure before understanding what it does. To explore 
what is happening in the brain of a living patient, neuroscientists may inject a
radioactive dye and observe its path. They diagnose epilepsy by inserting electrodes
into the skull for electroencephalograms (EEGs) that show patterns of frequencies in
these ‘brain wave tests’ labeled by the Greek letters alpha, beta, delta, and theta. They 
apply the descriptive information contained in some of their morphemes to deter¬ 
mine the specific locations to place electrodes; they use their knowledge of a physical 
structure to determine how it works and what its problems are. 

On the other hand, particle physics deals with invisible content known through its 
reasoned effects. These researchers question the causes of effects, reason logically to 
predict a possible explanation, and then plot a way to show the effect of the proposed 
explanation, perhaps with large expensive cyclotrons that record flashes of light when 
rare particles collide. Logical reasoning is a major tool, for their observations are even 
less direct than brain wave tests. 

2. overlapping fields share or divide terms. Scientists need research tools and 
terms from other fields, despite problems. Often words have different meanings in dif¬ 
ferent fields. A British billion is a thousand times more than an American billion. The 
American Heritage Dictionary lists two general definitions for nucleus and eight others 
labeled for specific sciences: biology, botany, anatomy, physics, chemistry, astronomy, 
meteorology, and linguistics. Neuroscience needs biology, chemistry, electricity, physi¬ 
ology, and physics to explain synapse (from Greek ‘join together’), wherein neurons 
secrete a neurotransmitter like dopamine to stimulate electrical responses in another 
neuron or cell. 

Both neuroscience and particle physics involve terms from electricity. The affix 
-ion has a minor relevance in discussing the brain because in 1875 Broca wanted to 
use the end of inion as a suffix to name parts of the skull (e.g. gnathion and gonion). 
He knew that inion was Greek for ‘the nape of the neck’ and had been used in English 
since 1811 (OED). His choice lost out to occiput from Latin roots for ‘against’ and ‘head,’
and now the occipital lobe at the back of the brain is the visual center. Both choices
describe locations.

Michael Faraday had already utilized ion in 1834 when he was creating the electric 
motor and generator and needed to refer to a charged group of atoms. His source 
was Greek ion for ‘something that goes,’ from ‘to go,’ because an electric charge causes 
movement as ions travel from one electric pole to another. In 1891 -ion became a suffix 
added to the electr- root for the magnitude of charge of each bond in the atom 3 , and 
in 1902 electron became the name for the stable elementary particle that is a constituent
of all atoms. After -ion became a suffix in electron, proton, and neutron, it was used in 
naming many other particles such as meson, muon, pion, lepton, and fermion. 

Particle physics formed nouns from adjectives or other nouns more than twice 
as often as neuroscience (24 v. 10). A conflicting choice of forms offends the physi¬ 
cist who complained that although gravity waves have a particular meaning in fluid 
dynamics, particle physicists had usurped that term rather than use what he called 
‘the more proper gravitational waves’ (Visser 2001:834). 

3. advancements may imply changes. New developments require new vocabulary. 
Some fields are changing so fast that anyone who does not read the latest reports is 
lost. Journal writers quickly assume that their initiated readers understand new terms. 
Furthermore, while some names of particles change when they become better under¬ 
stood, as when mu mesons became muons, other names, such as atomic bomb, persist 
despite changes that contradict their roots. High school chemistry teaches that matter 
is made of molecules, which are made of atoms, which meant ‘indivisible’ in Greek in 
accordance with the ancient understanding that is now contradicted in what is called 
‘the Standard Model’. A newer refinement is superposition, being in two states at once, 
a concept which is difficult for some minds to accept. 

4. two-word terms represent complexity. Terms of more than one word fre¬ 
quently include the names of scientists, but understanding their significance requires 
knowledge of the history of the field. Particle physics was more than three times as 
likely to name a scientist involved (27 v. 8): Cerenkov counter, Higgs boson, Fermi-Dirac
statistics, Fermilab, fermion, and fermium. (In eponyms, capitalization is irregular.) 
Although Fermi himself was honored by having an element named for him, it was 
not the one that he discovered. 

Many other terms consist of more than a single word. Neuroscience terms averaged 
1.5 words, and terms in particle physics averaged 1.1. The latter, however, consisted of 
two or more nouns in sequence more than three times as often as did neuroscience 
terms (38 v. 11, or 28% v. 8%). Examples include Down’s syndrome and iconic memory in
neuroscience and angular momentum and coupling constant in particle physics. 

Two-word terms, which predominate, present special problems in both science and
general usage, because there are many ways in which the words relate to one another.
Is a coupling constant a constant that couples with something or the constant of interaction
between couples? Is iconic memory the memory of icons or memory that works
by creating icons? It is not deducible that a paper clip holds paper, while a paper plate
does not and is now probably made of plastic. Chaos theory and fuzzy logic sound silly,
but the terms have become so well known that general usage describes inaccurate or 
manipulative figures as fuzzy math. However, science needs terms to refer to what it 
deals with, and it must make calculations. Chaos theory deals with the movement of a 
particular particle that is so random that it cannot be predicted although the majority 
of the motion is clear; deterministic rules can result in disproportionate effects from 
a small change in starting conditions. Scientists need appropriate vocabulary for rea¬ 
soning about random features to search for a pattern. 

5. acronyms and nominalizations contribute to brevity. Although acronyms
were common in current journals, only occasionally did the glossary sample reduce
terms to initials or pronounceable acronyms (6 and 4). A neuroscience example
is dopamine, a chemical secreted by neurons and involved in memory, emotions,
schizophrenia, and Parkinson’s disease. D+O+P+A is an acronym for dihydroxyphenylalanine,
plus amine, ‘compound derived from ammonia.’ It has nothing to do with
dope from Dutch for ‘sauce’ (Gregory 1987: 519). Brain activity is studied with elec¬ 
troencephalograms (EEGs), magnetic resonance imagery (MRI), and positron emission 
tomography (PET). The Stanford Linear Accelerator Center is a research center known 
by the acronym SLAC, and the Sudbury Neutrino Observatory is referred to with the 
acronym SNO, pronounced as ‘snow.’ An anticipated theory that gets a great deal 
of discussion is called supersymmetry and is abbreviated to SUSY. It involves many 
dimensions and may reveal something about the relation between force and matter. 

The clever brevity of terms in quantum physics is economical but not always clear. 
It seems to be a current contagious trend, used frequently for neurotransmitters now 
being discovered. Recent journals report many materials known by acronyms, such as 
MuSK (muscle-specific kinase, Tin 2001), and SLAM (signaling lymphocyte activation 
molecule). SLAM is incorporated in another acronym, SAP for ‘SLAM-associated
protein’; it is also used as a verb in a headline, ‘SLAMing T-cell differentiation’ (Nature
2001b). Clever acronyms and shortened forms abound. A material that inhibits neu¬ 
rons in synapse is called Nogo. A midline repellant is called Slit, a ‘secreted ligand for 
the transmembrane protein roundabout’, which is known as Robo. 

These terms are too new for the standard glossaries consulted. Acronyms and 
other shortened forms of names are taught in technical writing courses as COIK, 
‘clear only if known.’ 

Another method of shortening the presentation of information utilizes nominalizations,
nouns made from verbs, adjectives, or other nouns (examples from the
acronyms and initialisms in the previous paragraphs include emission, resonance, 
accelerator, and observatory from verbs; center and symmetry relate to adjectives, and 
imagery comes from another noun related to a more common verb). Nominalizations 
of verbs condense information but omit the actor, time, and probability of the action. 
This condensation is efficient for those knowledgeable in the field but decreases read¬ 
ability tremendously for the uninitiated. An arbitrarily selected sample sentence from 
a neuroscience article in Nature is at the extreme end of difficulty in readability statistics
(Fournier 2001:341):

Immunohistological staining of chick embryonic spinal-cord cultures localizes
the protein to axons, consistent with mediation of axon-outgrowth inhibition
induced by Nogo-66 (Fig. 5d) 4 .

The syntax seems simple (subject, one-word verb, direct object, no subordinate 
clauses). It has only one article but seven other modifiers, and five prepositional 
phrases. The lexical density is very high: 73% of the words carry content, including the 
27% of the words that are nominalizations, one of which is a hyphenated two-word 
modifier. An earlier more extensive analysis compared parallel reports on neurosci¬ 
ence research in Science and found the same patterns of heavy use of nominalizations 
and prepositional phrases (Hartnett 2001). Specialists got shorter sentences in a more 
nominal style with tremendously more nominalizations and prepositional phrases 
than uninitiated readers saw in parallel articles reporting the news and significance 
of the same research. Reports for specialists in the field had simpler syntax but greater 
lexical density. 
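
Lexical density, as used above, is the proportion of the words in a passage that carry content rather than grammatical function. The following is a minimal sketch of that calculation, not the procedure used for the figures reported here: the function-word list is a small hypothetical stand-in, the tokenizer is an assumption, and a real analysis would rely on a fuller list or a part-of-speech tagger, so the percentages it yields need not match those in the text.

```python
import re

# Hypothetical, deliberately small list of English function words.
# A real study would use a fuller list or a part-of-speech tagger.
FUNCTION_WORDS = {
    "a", "an", "the", "of", "to", "by", "with", "in", "on", "for",
    "and", "or", "but", "is", "are", "was", "were", "be", "it", "that",
}


def tokenize(sentence):
    """Split a sentence into lowercase word tokens, keeping internal hyphens."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", sentence.lower())


def lexical_density(sentence):
    """Return the share of tokens that are not function words (0.0 to 1.0)."""
    tokens = tokenize(sentence)
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)


# Illustrative sentence (not from the corpus): five of its ten tokens
# are function words under the list above.
example = "The protein is localized to axons of the spinal cord"
print(f"{lexical_density(example):.0%}")
```

Because hyphenated modifiers such as axon-outgrowth are kept as single tokens, the choice of tokenizer affects the result as much as the choice of function-word list; both are assumptions of this sketch.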

6. specialized meanings and systems help only the initiated. In neuroscience
sympathetic has a specialized meaning 5 , but particle physics narrowed the meaning
of common words three times as often (53 v. 16). Examples include element, field, 
fixed, spin, and string theory. Fixed means set and unchangeable, and spin 6 is simpler 
than chirality, the left or right-handed orientation, from Greek for ‘hand.’ String 
theory is an untestable hypothesis involving single-dimension particles in a world 
of many more dimensions. 

Compared to the vocabulary of neuroscience, particle physics had five times as 
many ordinary words that could mislead careful readers by narrowing, broadening, or
otherwise changing their meanings (11 v. 2). An atom bomb involves a nuclear reaction;
a black body is a hypothetical perfect absorber of all incident radiation but is not
black when heated; a deuteron is a positively charged subatomic particle which has 
only one neutron and one proton, although deu adds them together as two; and the 
Manhattan project was deliberately named to mislead for secrecy in developing 
the nuclear bomb. In German writing by Einstein and Planck in 1904, a quantum 
was the smallest unit of energy, from Latin for ‘how great.’ It is great only in compari¬ 
son, but common usage today implies something significantly large. 

7. whimsy is creative. Particle physics had a wide range of types of terms. In the 
original statistics from the glossaries, it used 50% more learned terms than neurosci¬ 
ence did (49 v. 32), such as path integrals, perturbation and resonance, often with nar¬ 
rowed meanings. Slightly more of its terms (72 v. 60) were somewhat decipherable, 
such as gluon, accelerator, and absolute zero. It used a few metaphors, such as excited 
state 7 in addition to some whimsical terms like quark. Quarks are constituents of 
hadrons and are composed of elementary particles bound together appropriately by
gluons, which have neither mass nor charge. Metaphor and whimsy were absent from 
the original neuroscience glossaries. 

Stories abound showing that the imaginations of the researchers are the best 
source credited for some names. The technical but whimsical name for some elemen¬ 
tary particles is quarks. Murray Gell-Mann chose this term from a line in Joyce’s 
novel, Finnegans Wake, ‘Three quarks for Muster Mark!’ In a scurrilous poem, a 
legendary king got complaints or birds’ squawks (quarks as a standard verb, AHD). 
Gell-Mann altered the pronunciation to resemble quart as in a call at a pub or a bar 
for three quarts. The number three was relevant because only three such subatomic 
particles were known then; now there are six. These subparticles were called aces (by 
Zweig [Gribbin 1998a:158]) or partons (by Feynman), but those names did not stick.
The six types of quarks are distinguished by a property called flavor. The flavors are 
down, up, charm, strange, bottom, and then top, which was predicted by theory but not 
discovered until 1995. The strange quark was named because it behaved in an unusual, 
unanticipated way. 

Calling the types of quark flavors has no more basis than calling the study of their 
strong interactions quantum chromodynamics. Greek chromo means color. Certain
types of forces of particles are labeled for colors as a method of classifying them, yet 
they have no color. The term quantum chromodynamics is abbreviated to QCD and 
parallels quantum electrodynamics, which is abbreviated to QED, which suggests to 
academic minds the Latin abbreviation, q.e.d., quod erat demonstrandum, ‘which 
was to be demonstrated’, meaning it is already proven so that further discussion is 
unnecessary. On another occasion, when Gell-Mann realized how particles could fit 
together in groups of eight, he called his classification scheme the ‘Eightfold Way’, 
irrelevantly and irreverently suggesting the Buddha’s eight-step plan for righteous 
living. Such names persist, despite competing names and recognized problems. 

conclusions about the language of science. Morphology has its reasons, but 
they are not sufficient to simplify or even explain science, especially particle physics, 
which seems least conventional. The lexicons of science are difficult not because of 
their morphology but because of the nature of scientific research. It is always unsettled
and incomplete. Richard Feynman, the great physicist who reformulated quantum mechanics,
said that objective evaluation requires not knowing the answer, and that the
purpose of knowledge is to appreciate wonders more (1999:102-03). New knowledge
can contradict what is already accepted. Just as we say the sun rises and sets without 
believing that it really does, scientists continue to use terms whose etymology no 
longer reflects the current understandings. As understanding develops, only some 
terms and systems of terms change. There is no rule. 

Neither are there rules of how to form a name in many fields. As a result, many 
fields have had naming problems and are troubled by them now. Chemistry has more 
systematized names than most fields because it recognized the problem early and 
formed organizations in the nineteenth century to systematize nomenclature. Its 
suffixes express very specific identities. Some outmoded terms, however, are too well
established to be uprooted. Some biologists now want to scrap the system Linnaeus
described in 1758 because of the ‘current understanding of evolution and biodiversity’ 
(Pennisi 2001). Genetics also is having a serious problem with multiple names for 
the same gene or process and a single name applied to a dozen different ones (Niklas 
2001). A Gene Ontology consortium (with the acronym GO) was formed to control
the nomenclature (Pearson 2001). However, it must limit its work to names of func¬ 
tions, because it was already too late to eliminate the many problems with established 
names of genes or to stop the fun that geneticists have with the trend toward whimsi¬ 
cal acronyms like Shh for the Sonic hedgehog, built on hedgehog, a protein mutation 
that participates in signaling. They named a genetically modified rhesus monkey 
ANDi, for ‘inserted DNA’ backward (Saltus 2001).

Glossaries and systems of naming are helpful even when they are incomplete or 
inconsistent. Although glossaries cannot include terms coined too recently to have 
become established, the lexicon of a field reflects its current working conditions and 
research methods. Glossaries cannot substitute for a broad, deep education. They are 
intended to cover a single field, but modern scientists must know more than their 
single field, and the borders are not clear. Research must be verified by repeated 
experimentation to overcome the possibility of various causes of a perceived result. 
The research methods and the interferences may involve outside influences. Neuro¬ 
science deals with living patients who respond to electrodes in accordance with the 
treatment and also with their own cultures and personalities, individual life histories, 
and interpersonal relationships with the staff. These are not explained in the encepha¬ 
logram. Particle physicists have an even greater problem of not being able to view 
their invisible materials directly; they must reason back from perceived effects to 
create a hypothesis to explain them. 

Scientists find brevity advantageous, even necessary in a complicated statement. 
They economize by using acronyms and nominalizations that may eventually find 
their way into the mainstream, whether they are sufficiently clear or not. Thus scien¬ 
tists get a reputation for being hard to understand. Only those who know a field and 
its history recognize the names of the predecessors whose shoulders they stand on; 
they find eponyms relevant and easy to remember, although they are sometimes only 
honorary, sometimes very precise, sometimes not. 

Physicists love to tell stories of gaining insights during Eureka moments when they 
shout, ‘Aha!’ These moments often occur away from the lab, perhaps during a vacation
when the scientist’s mind is still almost unconsciously playing with the situation. The 
whimsical names that physicists coin show that they are clever human beings who 
enjoy puns and other sorts of humor. Physical scientists in all fields enjoy exchang¬ 
ing what they call ‘engineer jokes’ at family gatherings, other social affairs, and on a 
website (www.dctech.com/physics/humor.html). To be creative, scientists must use 
their imagination. Modern geeks are not all lonesome nerds, despite the stereotype. 
Creative researchers realize that teamwork and language are important. They enjoy 
their work and welcome new information, just as I enjoyed doing this research. 
The Higgs boson is referred to informally as ‘the God particle’ because this hypothetical
subatomic particle is believed responsible for all mass in the universe. It is named after 
the British physicist Peter Higgs, who theorized about relevant interactions. Bosons are 
particles that behave in accordance with Bose-Einstein Statistics. S. N. Bose (in India, 
1894-1974) discovered a statistical rule and informed Einstein, who built on it to predict 
the Bose-Einstein Condensate (BEC, which may contain pointed excitations called sky- 
rmions). If found naturally, BEC could be used to study supernovae. Boson was coined in 
1947 when Dirac combined the name of Bose with ion. 

The ‘spooky’ EPR paradox was named with the initials of the originators, Einstein, Podolsky,
and Rosen (Gribbin 1998a:126-27).

Electr-: L from Gk ‘amber,’ which attracts bits of chaff when rubbed. In 1600, Gilbert distinguished
electricity from magnetism by adapting the Latin word that Roger Bacon had
used around 1250 as an adjective for electric (Fahnestock 1999).

Immunohistological staining: a neuroanatomical technique for studying synapse.

Sympathetic: Gk ‘with’ + ‘emotion’; inhibiting physiological effects.

Spin has three different physics definitions in AHD (2000). The new field of spintronics 
uses spin to carry information. Cf. chirality, from L from Gk for ‘hand,’ the left or right 
handed orientation of the spin of a particle. 

Excited state: ME from L ‘set in motion’; physics metaphor, raised to a higher energy level.

PLATO AND ARISTOTLE VERSUS WRITING

from its beginning, structural linguistics, in most of its many forms, has given short
shrift to the study of written language. Saussure (1959:23 [1915]) says that ‘language 
and writing are two distinct systems of signs; the second exists for the sole purpose of 
representing the first.’ Sapir (1949:20 [1921]) says that ‘written language is... a point- 
to-point equivalence... to its spoken counterpart. The written forms are secondary 
symbols of the spoken ones—symbols of symbols...’ Bloomfield (1933:21) says that 
‘writing is not language, but merely a way of recording language by means of visible 
marks.’ Hockett (1958:4) says that ‘the linguist distinguishes between language and 
writing’ [italics in the original]. Martinet (1964:17) says that ‘the study of writing is a 
discipline distinct from linguistics proper’ 1 . 

A few structural linguists do discuss writing more extensively. Gleason (1955:301) 
includes two chapters on writing, but says that ‘many linguists consider all forms of 
writing [to be] entirely outside the domain of linguistics [...] [although] many of the 
same methods of study can be used for dealing with both, and the structures revealed 
are in many respects similar’. Trager (1972:19) devotes half of his book to writing systems,
but says that ‘language comes first, and writing is at best a secondary symbol
system of recent development’ 2 . 

Structural linguists’ insistence that they should study spoken material rather than 
written material 3 may have arisen from a reaction to the practices of the Neogram¬ 
marians and to the popular and school-grammar idea that written language is better 
than spoken language, but it would have been supported by certain statements in the 
works of Plato and Aristotle that are part of our Western cultural heritage. I want to 
suggest that the statements of these philosophers have become uninvestigated
commonplaces 4 of our linguistic thought. Because the theme for this year’s forum is
‘What constitutes evidence in linguistics?’, I want to look at those statements
and consider whether they are valid evidence for the nature of language.

1. plato. Plato’s statement that is usually cited in opposition to a distinct linguistic 
status for writing occurs in his ‘Phaedrus’ (Plato 1952a:138-39), where he tells the
Myth of Thoth. Plato says that he has heard a story about Thoth, the Egyptian god
of writing, mathematics, and magic. In this story (as Plato tells it) Thoth tells the 
king of Egypt that he has invented letters, and tells the king that these letters will 
be a good thing for both memory and wit. The king tells Thoth that the letters which 
he has invented are a bad thing, and that letters ‘will create forgetfulness in the learn¬ 
ers’ souls, because they will not use their memories’, and that people who use letters 
‘will have not truth, but only the semblance of truth’, and that they ‘will have the show
of wisdom without the reality’. Plato then goes on to say that writing is like a paint¬ 
ing, which always appears the same but cannot answer questions that are asked of it, 
unlike ‘an intelligent word graven in the soul of the learner’, which can do so. Plato’s 
interlocutor asks him whether he refers to ‘the living word of knowledge which has 
a soul, and of which the written word is properly no more than an image’, and Plato 
(1952a:139) says that that is indeed what he means. Plato (1952a:140) also says that writing
can be only a reminiscence of what we know, and that clearness, perfection, and
seriousness are to be found only in those principles of justice, goodness, and nobility 
that are communicated orally and are graven in the soul. 

Plato expresses the same attitude toward writing in his autobiographical ‘Seventh 
Letter’ (Plato 1952b:810), in which he says that if philosophical truths are to be put into
words, the words must be spoken, not written, because ‘no man of intelligence will 
venture to express his philosophical views in language, especially not in language that 
is unchangeable, [such as language] that is set down in written characters’. 

To find out why Plato said these things, we must look both at the man Plato and at 
his times. When Plato was in his twenties, Athens went through two political revolu¬ 
tions within one year. It had been a democracy ruled by all of its citizens (that is to 
say, by all of its adult, free, native-born males), although its democracy was often led 
by members of its old, propertied families. In the year 404 b.c.e., Athens lost a war 
with Sparta, the democracy was overthrown, and power was seized by an aristocratic 
oligarchy called the Thirty Tyrants. Then, after a few months, the Thirty Tyrants were 
overthrown, and another democracy was established. 

Plato was an aristocrat descended from some of the oldest and most aristocratic 
families in Athens. The Thirty Tyrants included both his mother’s brother and her 
cousin, and Plato was asked to join them in ruling Athens, but he declined to do so 
because he saw how violently they were acting. Then after the Thirty Tyrants were 
overthrown and a democracy was restored, he found that the restored democracy was 
acting as violently as his friends and relatives the oligarchs had acted 5 . 

Plato therefore found himself at loose ends. In his ‘Seventh Letter’, he laments that 
he was unable to do what a young man of his background should be doing—namely, 
helping to run his country. Because he realized that a career in Athenian politics was 
not then open to an aristocrat like himself (Plato 1952b:801), he decided to travel, and
as he did so he found his vocation of philosophy. Eventually he returned to Athens
and opened a school (Gomperz 1955, vol. 2:250-52, 254-59, 270; Guthrie 1975:10-19). But
even as a philosopher, he remained an aristocrat opposed to democracy. In both his 
‘Laws’ and his ‘Republic’, he envisions ideal governments in which the rulers (who will 
be few) will be philosophers and philosophers (who will be few) will be the rulers. 

What was happening to the use of writing during these times? Simply put, in Hel¬ 
lenic society the general use of writing was a mark of democracy. In earlier times, 
when all the Hellenic city-states were ruled by aristocracies, writing had been limited 
to what has been called ‘craft literacy’. There were some literate people who could be 
hired to write when writing was needed and could be hired to read when reading
was needed, but reading and writing were not things that people expected to do for
themselves (Illich & Sanders 1988:22-23) 6 . In early aristocratic Greece, information
was preserved not by writing, but by memory; and important information was
preserved only in the aristocrats’ memories. But after Athens became a democracy, 
beginning with Solon’s constitutional revision in the early 6th century b.c.e. (Plutarch 
1952:73), public writing flourished, and everything from laws, treaties, and election 
results to public financial accounts was inscribed on walls for everyone to see 7 .

A well-recorded example of this sort of thing happened in the history of Rome. 
Linguistically, Rome was not Greek, but culturally it was a Hellenic city, with a popula¬ 
tion that was divided into aristocrats and commoners. At first, information about the 
laws of Rome could be found only in the memories of the aristocrats. Eventually, 
the commoners rebelled at this arrangement; in the year 450 b.c.e. they succeeded in 
having the laws of Rome compiled and inscribed on the Twelve Tables, which were 
posted up publicly so that everyone could read them or else have them read to him 
by some literate friend (Boak 1955:78-79; Heurgon 1973:169-70). But even after the 
substantive law was publicly codified in the Twelve Tables, the procedural law could 
be found only in the memories of the aristocrats, and although a commoner might 
know what the law was, he did not know how to go to court and enforce it. It was only 
in 304 b.c.e. that Gnaeus Flavius, the son of a freedman and a client of the reforming 
aristocrat Appius Claudius, published, with the connivance of his patron, information 
that would let the commoners in on the secrets of how to apply the law (Boak 1955: 
83; Heurgon 1973:197). In Rome, therefore, as in other Hellenic cities, the greater availability
of readable facts about public affairs and the greater ability of people to read
them were a part of the democratization of government. 

We can see, therefore, that Plato was doing two things when he disparaged writing in 
his ‘Phaedrus’ and in his ‘Seventh Letter’. First, he was not attacking writing rather than 
speaking as a means of communication; he was attacking writing rather than memory 
as a means of keeping records. Second, when he attacked writing, he was not being a 
neutral philosopher dispassionately seeking the truth; he was being a disgruntled aristocrat,
complaining that he and his friends were no longer ruling Athens.

2. aristotle. Even though Aristotle was Plato’s student, they were very different
people. Plato throughout his entire life was aware that he was an aristocratic Athenian,
and he lived in Athens whenever he could. Aristotle was born into a middle-class
family in Stagira, a small Hellenic city at the northern end of the Aegean Sea,
and he did not identify himself with that city. Stagira was located near Macedon, a 
semi-Hellenic kingdom lying north of Greece. Aristotle’s father was court physician 
to the king of Macedon who was the father of Philip II and grandfather of Alexander
the Great, and Aristotle grew up around the Macedonian royal court. Eventually
Aristotle settled in Athens and opened a school there, but only because it was the cultural
center of the Hellenic world (Gomperz 1955, vol. 4:19-25).

Aristotle’s statement that is usually cited in opposition to a distinct linguistic status 
for writing occurs in his ‘On Interpretation’, where he says that ‘spoken words are the
symbols of mental experience, and written words are the symbols of spoken words’
(Aristotle 1952:25). Aristotle may well have said this simply because, at the time that 
he said it, it was an accurate statement of how writing was used for communicating in 
the Greek language. In Greco-Roman times, people normally read aloud, even when 
they read to themselves; they did not expect to read silently. There were exceptions. 
Some passages in the plays of Euripides and Aristophanes assume that silent reading
is being done, even though the practice may have been unfamiliar to the plays’
audiences (Svenbro 1993:163-64). And there are two well-known Roman examples 
which show how unusual it was then for people to read silently to themselves. Saint 
Augustine says in his ‘Confessions’ that he was amazed to discover that Saint Ambrose 
(the bishop of Milan who died in 397 c.e.) could read to himself without pronouncing 
the words aloud (Augustine 1952:35). And Plutarch records twice an incident in the 
life of Julius Caesar: At a meeting of the Roman Senate during the Catiline conspiracy, 
a letter was brought in and given to Caesar, which he read silently to himself, and 
Cato, who prided himself on being the epitome of Roman virtue, demanded that the 
letter be read out aloud so that everyone could hear what Caesar was up to. This incident,
which was recorded (Plutarch 1952:628,804) because the letter was a love letter
to Caesar from Cato’s sister, points out how unusual and therefore how suspicious it 
was then for anyone to read a document silently, thereby depriving the bystanders of 
their chance to hear what was written in it. 

It can also be argued that the inscriptions on ancient Greek statues were not 
intended to be read silently, because their grammatical forms only make sense if the 
statues themselves are assumed to be speaking. Apparently they were designed so 
that, if people were looking at a statue and a literate person were present, that person 
would read the inscription aloud, thereby making audible the voice of the statue itself 
(Svenbro 1993:ch. 2) 8.

There is, however, an important difference between written documents as they 
were produced in Greco-Roman times and written documents as they are produced 
now. In those earlier times, almost all writing was done without any spaces or marks
between words, in the style called ‘scriptio continua’ 9. In order to read such a text, a
reader might have to pronounce strings of letters aloud and use his/her subconscious, 
native-speaker knowledge of the language to try out the various ways of grouping the 
letters of those strings in order to figure out just what words they represented (Svenbro
1993:45,166-67). But this would be no trouble for those who read such writing if
they were native speakers of the language in which it was written. 

The modern practice of regularly writing spaces between all of the words in every 
text seems to have started in Ireland in the 6th century c.e. 10 In those years, there were
two things about Ireland that made it an unusual place. One was that the Irish were, in 
their time, the only really literate nation in Western Christendom. The rest of Europe 
was going through the real Dark Ages; the Western Roman Empire had fallen to the 
barbarians, and Charlemagne’s efforts to revitalize learning were a couple of centuries 
in the future. In the monasteries of Ireland, however, there was a general practice of 
scholarship; throughout Ireland manuscripts were regularly studied and copied in 
Latin, in Greek, in Irish, and even in Hebrew (Cahill 1995:158-60,183). ‘During several
centuries it was said that if any man in Western Europe knew Greek he must be Irish-born
or Irish-trained’ (Hannah 1925:102).

The other unusual thing about Ireland was that it was the only part of Western 
Christendom which had never been part of the Roman Empire 11, and the Irish therefore
spoke a vernacular language that was not a Romance, Latin-derived dialect. In
most of Western Christendom in those times, people who wrote anything at all wrote 
it in as good Latin as they could manage, and they read by pronouncing the writing 
aloud in their own Romance dialect, which would be close enough to the written 
Latin that they could use their subconscious, native-speaker knowledge to figure 
out what the written Latin words were, even if the text were written without breaks 
between the words. In Ireland, however, everybody who knew Latin had learned it as 
a second language, and no one had a subconscious, native-speaker knowledge of it. 
In order to read written Latin, the Irish therefore had to have texts that were divided 
up into the words which they had learned, so that they could identify those words 
(Illich & Sanders 1988:46). 

Then, in the next few centuries, it was Irish missionary monks who spread Christianity
and literacy across northern Europe (Hannah 1925:104,177-79; Cahill 1995:
170-71,183-84). They started on the western coast of Great Britain; the part of that 
island which had once been within the Roman Empire had been overrun by
non-Latin-speaking pagans, and a Romance dialect was no longer the vernacular language
there. Beginning in 564 c.e. with Saint Columcille (known in Latin as Columba),
Irish monks came to Great Britain and christianized it, starting with Northumbria,
the northernmost English kingdom (Cahill 1995:200), which became the intermediate
step in the expansion of the Irish literary and scribal tradition to the rest of
England and thence to the European continent. Many of the leading missionaries 
of northern Europe in the 7th and 8th centuries c.e., including Saint Columbanus in 
France and Italy, Saint Willibrord in the Netherlands, and Saint Gall in Switzerland, 
came from Ireland or from Northumbria (Cahill 1995:188-96, 205-09). (Saint Boniface,
originally named Wynfrith, was from Wessex, but by his time the Irish influence
had spread beyond Northumbria to all of England.) Later, Charlemagne would send 
to the Northumbrian capital of York in order to get Alcuin (Illich & Sanders 1988: 
59-60) to come to his court and lead the revitalized schools he was trying to establish
in his empire. And, as these Irish and Irish-influenced missionary monks spread
Christianity across northern Europe, they took with them the Irish practice of writing
Latin with word breaks, which thereby became established as the usual European
scribal practice. This practice of writing with word breaks was what made silent reading
practicable.

We can see, therefore, that when Aristotle said that ‘written words are the symbols of 
spoken words’, he was simply describing how writing was used in his own day in the Greek 
language. What he said, however, has since become one of the uninvestigated commonplaces
of linguistics, and when we have investigated the circumstances under which he
said it, we see that it does not apply to writing systems which have word breaks. 

3. conclusion. We see that both Aristotle’s and Plato’s statements which seem to support
the idea that writing is a mere representation of spoken language are invalid or
are no longer valid. We therefore see that these statements cannot be used as arguments
against the idea that we should regard writing as a distinct part of a human
linguistic system for literate human beings. 


Some of these linguists include short discussions of writing, but only because writing is 
needed for recovering the earlier pronunciations of language or because of spelling pronunciations,
which are seen as results of extra-linguistic influence.

Two groups of linguists in these years did not give the same sort of primacy to spoken language;
possibly for this reason, they were not usually regarded as ‘structuralists’. The linguistic
theory of Hjelmslev and his colleague Uldall, which was given the name ‘glossematics’,
assumes that languages have highly abstract expression structures which can be represented 
equally well in various media for the sake of communicating. ‘The [expression] elements in 
a linguistic structure may be represented in any way whatsoever, provided only that the elements
required by the structure are kept distinct. The elements may, for example, be represented
graphically, with each element having its own letter. So long as the letters are distinct
from one another, they may have any shape desired... The elements can also be represented
phonetically, each element by its own sound, no matter what, so long as it is sufficiently
distinct from the others... The manual alphabet of deaf mutes is another special way of 
representing the expression elements of a language’ (Hjelmslev 1970:40-41). Vachek and his 
colleagues, who were known as the ‘Prague School’, assume that any written language has 
separate written and spoken norms. Vachek (1973:9-39) gives a survey of what various linguists
have said about the relationship between writing and speaking.

The difference of opinion as to whether writing is part of language or peripheral to it, and 
therefore whether the term ‘grapheme’ means something inside language or something 
outside language that refers to something inside it, lies behind the controversy in Daniels 
1992, Herrick 1995a, Daniels 1995, and Herrick 1995b. See also Lockwood 2001 and the terminology
that is given by Kohrt 1986, to which Lockwood refers.

Every science probably has its uninvestigated commonplaces: things which ‘everyone has
always known’ and which the scientists in that field assume and accept without thinking
about them. Whenever scientists express new ideas, there is a chance that some of the
uninvestigated commonplaces of their field will accidentally slip into the ways they formulate
their new ideas. It is therefore helpful to any scientific field to point out some uninvestigated
commonplaces that may slip into its work. An example of a difficulty raised
by an unexamined commonplace can be found in the history of astronomy: Copernicus 
believed that the Earth and the other planets traveled around the Sun; however, because 
‘everyone knew’ that the heavens, which are composed of the quintessence, are ideally 
pure, and ‘everyone knew’ that the ideal geometrical form is a circle, Copernicus believed 
that their orbits around the sun had to be circular or at least describable by combinations 
of circles. He therefore postulated epicycles for their orbits that were almost as complicated
as those of Ptolemy’s Earth-centered system (Koyre 1971:26-27,548-61). It took several
decades for Kepler to figure out how the Earth and the other planets really go around
the Sun, and that they travel in ellipses, not in circles (Pannekoek 1961:240-41).

This became clear when, among many other things that happened, Socrates, who was 
Plato’s teacher, was judicially murdered by the restored democracy. Socrates himself was a 
stonemason, a middle-class artisan, but he had been the friend and teacher of some of the
Thirty Tyrants, including one of Plato’s relatives, and the leaders of the restored democracy
did not forgive Socrates for his association with those aristocratic oligarchs.

6 Writing as ‘craft literacy’ functioned very much like present-day stenotypy. Nowadays, if 
one needs to have a record made of what is said in court or in a deposition (and if one 
cannot use a video camera), one hires a stenotypist who has the skill to create that kind of 
written record. But not everyone expects to be able to do it.

7 See, for example, the plates in Woodhead 1981. 

8 I once had a student who functioned much like everyone did in Greco-Roman times. She 
had a real problem with studying, because she could not understand her textbooks when 
she read them; she simply could not understand any words that she had perceived by seeing 
them. What she did was to read all of her textbooks aloud into a tape recorder (and after 
doing so she had no idea of what she had read). She then played the tape and listened to what 
she had read aloud, and then she understood what the textbooks said. We now think of this 
as a pathological condition, but people of Greco-Roman times read in essentially the same 
way: they read aloud, and they listened to what they were saying in order to understand. 

9 There were forerunners. In Roman imperial times, monumental inscriptions and the 
works of Virgil were ordinarily inscribed with spaces or special marks between words. 
Saint Jerome (died 420 c.e.) sometimes inserted marks between words in his translations 
from Hebrew into Latin, and an early manuscript of a work by Saint Isidore of Seville 
(died 636 c.e.) put spaces between the words in its headings (Illich & Sanders 1988:46). 

10 The Benedictine monks are often regarded as great copiers of manuscripts and preservers
of literature; but at first this was not a part of their monastic vocation. Saint Benedict’s
Rule says nothing specific about book-learning, although it assumes that books will be 
available for the divine service and for edifying reading by those monks who are literate 
(Leclercq 1961:15-22,28-30). 

11 It has been said that in these centuries the Irish had a Christian culture which was somewhat 
modified in an Irish direction and that the rest of Western Christendom had a Greco-Roman 
culture which was somewhat modified in a Christian direction (Cahill 1995:148-49). 


A SOUND-MEANING RELATIONSHIP AS EVIDENCE FOR
ERROR-CONTROL CODING IN LANGUAGE 


John T. Hogan 
University of Alberta 


pursuant to the theme of the nature of evidence in linguistics, the claim is made 
that the meaning-sound relationship of the linguistic sign shows evidence that the 
lexicon is guided by principles of error-control coding. In this paper, two complementary
and traditional meaning-sound relationships, arbitrariness and iconicity, are
reviewed, and it is proposed that a third and independent relationship holds between 
sound and meaning for basic linguistic signs, and that evidence for this can be found 
in English thesauri and in a cross-language study of the phenomenon. This third 
relationship follows from principles found in error-control coding, a major branch 
of information theory. 

The seminal work in the area of error-control coding is Shannon and Weaver’s The 
Mathematical Theory of Communication (1949). Cherry 1966 gives an extensive survey
of the ideas of information and coding theory as applied to human communication; its
first edition appeared in 1957, the same year as Chomsky’s Syntactic Structures (which
significantly changed the direction of research and theorizing in linguistics away from
empirical communication models to more highly formal ‘computational’ models of
language). During the
same time frame, psychologists (Miller 1951, Garner 1962) applied the concepts of 
information theory to psychological processes, and some linguists (Hockett 1953 and 
Gleason 1961) made early attempts to extend the ideas of redundancy, information 
measure and coding to language. In subsequent decades, information theory and 
coding theory have been extensively developed as branches of applied algebra and 
geometry in mathematics and have been applied to communication systems in electrical
and computer engineering (Hamming 1980, Hill 1991). However, with the focus
on other sorts of linguistic description after 1960, the application of coding theory
principles to language was forgotten. 

1. error control codes. Error-control codes serve to combat two main sources 
of message error in a communication network. The first source of message error
is noise. Noise in its widest sense is anything that destroys a message
encoded into a signal. Thus noise can be: (1) physical noise such as the humming of 
air conditioners; (2) physiological noise such as hearing loss at the receiving end or 
a speech impairment or slip of the tongue at the sending end; (3) neurological noise 
such as loss of attention due to metabolic changes, or cross-talk due to listening and 
thinking of something else at the same time; (4) sender noise due to misarticulation 
or hypoarticulation by a speaker; or (5) socio-cultural noise due to differences in 
individual backgrounds. The second source of message error is fading of
the signal.

To combat noise, a message can be repeated more than once or some feature can be 
added to it to indicate that some part of it has gone missing. This added feature is called 
an information check (Beckmann 1972). In both cases the redundancy of the message is 
increased along with the length of the message. Some information checks for nouns are 
number, case, gender, and articles; and for verbs, tense, mode, and aspect. 
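
The two strategies just described, repetition and the addition of an information check, can be illustrated in miniature. The following Python sketch is only an illustration under stated assumptions: messages are reduced to lists of bits, the repetition factor of three is arbitrary, and a single parity bit stands in for an information check. None of these choices comes from Beckmann 1972 or from the linguistic data.

```python
from collections import Counter

# Assumption: a message is a list of bits; real linguistic signals are far richer.

def repeat_encode(bits, n=3):
    """Repeat every bit n times (a simple repetition code)."""
    return [b for b in bits for _ in range(n)]

def repeat_decode(received, n=3):
    """Recover each bit by majority vote over its n copies."""
    blocks = [received[i:i + n] for i in range(0, len(received), n)]
    return [Counter(block).most_common(1)[0][0] for block in blocks]

def add_parity(bits):
    """Append one check bit so that the total number of 1s is even."""
    return bits + [sum(bits) % 2]

def parity_ok(word):
    """Return True if no error is detected (an even number of 1s)."""
    return sum(word) % 2 == 0

message = [1, 0, 1, 1]

sent = repeat_encode(message)   # 12 bits instead of 4
noisy = sent.copy()
noisy[4] ^= 1                   # noise flips one bit in transit
decoded = repeat_decode(noisy)  # majority vote restores the message

checked = add_parity(message)   # 5 bits instead of 4
damaged = checked.copy()
damaged[2] ^= 1                 # noise flips one bit in transit
```

In both cases the message grows (from 4 to 12 bits under repetition, from 4 to 5 with the check bit), which is the increase in redundancy and length noted above. Repetition allows a single error to be corrected, whereas the single check bit only signals that something has gone wrong.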

Fading of the signal, the second source of message error, occurs physically when the 
signal is low in intensity. Psychologically, fading can be viewed as the loss of the signal 
due to limitations of short-term memory. To combat this type of loss, signals should be 
as short as possible or within the range of seven plus or minus two items. The distribution
of word frequencies and lengths described by Zipf (1965:28) follows
from this. Long words of high frequency usually become shortened rather rapidly. Processes
such as contraction and grammaticalization also shorten the message.

Clearly, the pressures to combat deterioration of messages due to noise and to prevent
fading in memory are opposed to each other: the first leads to message lengthening
to increase clarity while the second leads to message shortening to increase
economy. Thus, there must be optimization trends in communication systems to balance
these pressures. This paper considers evidence for error-control coding as such
an optimization feature in the mental lexicon. 

2. TRADITIONAL APPROACHES: ARBITRARINESS AND ICONICITY. 

2.1. arbitrariness. In de Saussure’s Course in General Linguistics, the linguistic sign
is given as an association between sound (phonological) and meaning (content). His
ideas about the sign, the fundamental unit in his language system, were subsequently
developed into various semiological theories in Europe. In American linguistics, the
basic unit comparable to the sign is the morpheme. De Saussure presented several properties
of signs, including linearity, conventionality, arbitrariness, immutability of sound
structure in the short term and mutability in the long term. The property of interest here 
is arbitrariness, i.e., the notion that the meaning of the sign (or morpheme) cannot be 
deduced from the sound (phonological structure). This idea was first examined in the 
dialogues of Plato. In his dialogue Cratylus, two opposing views are presented. One consists
of the thesis that there is a physical connection between sound and meaning: that
once the meaning is determined the sound structure must necessarily be determined. 
The antithesis is that there is no necessary connection between sound and meaning, 
merely an arbitrary connection, and Plato argues through his dialogue that this is the 
correct relationship. This position was held for two millennia and was repeated by de 
Saussure. De Saussure offered examples from European languages. For example ‘dog’ 
in French, Spanish, Irish, Latin, German, and Russian is chien, perro, madra, canis, 
Hund, and sobaka. The phonological sequences are highly dissimilar for languages 
even from a single typological group and geographically close to each other. 

Arbitrariness allows speakers of a language to innovate new linguistic forms without
any constraints, at least from a causal perspective. The number of morphemes
in a language can potentially approach infinity. However, psychological constraints
such as limits on the size of verbal long term memory and the counterbalancing role 
of syntax keep the average mental lexicon to roughly 80,000 morphemes and words. 
One of the most important features of arbitrariness is thus the maintenance of openness
in the mental lexicon (see Aitchison 1994:7). Another is the freedom to innovate
new lexical forms, as has been witnessed in conjunction with technological innovations
in the past century.

2.2. iconicity. More recently, a counterpoint to arbitrariness has been heard under 
the title of iconicity. Iconicity plays an important role in discourse, for example in 
the principle that the unmarked order of mention follows the order of actual events. 
Also, the proximity of elements in a construction mirrors the closeness of their relationship in the
world. Iconicity at this level indicates a correspondence of the syntactic order with 
relationships of mental events. For example, the proximity of kick and wall in The 
horse kicked the wall suggests that the wall was affected more than it would be in such 
a sentence as The horse kicked at the wall, where kick and wall are farther apart. So, 
for English constructions of the above type, affectedness as conceptual distance and 
linguistic distance appear to be correlated (Haiman 1985). 

Moreover, but to a lesser degree, there has been an interest in iconicity as a sound-symbolic
process. At a basic level, some sounds have a symptom function, such as a cough
or hiccup, or may indicate emotional states, such as ugh. Still others may have a vocative 
function, such as a throat-clearing or an ahem. These sounds are used to gain attention
and are rarely incorporated in any syntactic construction of a language. At a less
basic level, certain words can be sound imitative signs such as the imitation of animal 
sounds or sounds in the environment such as rapping or whistling. Words such as 
bow-wow, meow, tap, or hiss are somewhat conventionalized and follow the phonological
patterns of a language. However, there are also less conventionalized imitative
sounds such as howls used by a biologist to evoke responses from wolves or honks to 
attract geese, etc. The latter are not entered in dictionaries by lexicographers. 

Synesthetic sound symbolism is the vocal communication of phenomena from the 
visual, tactile and proprioception properties of objects. For example, the vowel [i] has 
been associated with diminutive referents and the vowel [a] with larger objects, as in slit- 
slot. Hinton et al. 1994 claim that the vocal tract length in the production
of these vowels differs: it is shorter for [i], which is closer to a child’s vocal tract length, and longer
and more horn-like for [a], which is closer to adult length. Vowel lengthening can also
be used iconically to indicate size. When the vowel in long is elongated considerably 
beyond its normal length, the suggestion is that an object is very, very long. Another 
example, using fundamental frequency, can be demonstrated with the word high in 
the sentence The moon is high in the sky. If a speaker pronounces high with a high 
fundamental frequency, the suggestion is that the moon is very high in the sky. These 
examples use acoustic signals to indicate the visual properties of length and height.

Iconicity and arbitrariness can be thought of as forming a scale, and cases may fall 
somewhere between the two extremes. Phonesthemic words such as glitter, gleam, 
glisten, glow, and glimmer clearly contain phonemic segments that are language-specific
and carry a common semantic connotation. Rhodes 1994 classifies onomatopoetic
words along a scale, from a wild end, with purely imitative sounds, to a tame end,
with words such as quack, hoot, moo. In the middle he has semi-wild words which
include many phonesthemic cases. Sound imitative words in this class can be analyzed
into those that are associated with abrupt onsets such as peep, beep, plink, click,
and creak, which all begin with stops. Words like thump, whack, and yap connote 
sounds with poorly resolved onsets. They begin with fricatives or approximants. The 
vowels in words such as peep/pop, clink/clunk, jingle/jangle are associated with high 
and low frequency sounds, which correlate with the average formant frequencies of 
these vowels, i.e., the average for [i] is higher than for [a]. Last, the final consonants 
or consonant clusters are associated with the decay of the sound. Words such as bong, 
boom, crunch, crash, have long decays, whereas pop, tweet, click have abrupt decays. 
This category of iconic words has a large number of members and it is unclear 
whether they should be analyzed as a recurring partial in morphological analysis. 

The extent of iconicity in the lexicon is an open question. Apart from polysemous 
and morphologically derived forms in the lexicon, arbitrary relationships between 
sound and meaning appear to predominate. 

3. THE SOUND/MEANING RELATION AND ERROR CORRECTING CODES 

3.1. the general principle of error detection. The main thesis of this discussion 
is that there is yet a third relation between sound and meaning. The principle is as 
follows: words that are close in meaning should be phonetically dissimilar, whereas 
words that have no semantic connection may be phonetically similar or even identical.
However, if this principle were a rigidly implemented rule, English would not
have the antonyms cleave ‘adhere’ and cleave ‘split’ or the confusion between semantically
related lie, lay, lain and lay, laid, laid. This principle, then, must be considered
only a tendency across languages. The idea behind the principle comes from information
theory, especially the theory of error-control coding. The goal of such codes is
to minimize the risks of error by placing signals that could potentially be confused 
far from each other in signal space. For example, the word help does not have many 
actual words close to it. This can be seen by using the phoneme commutation test 
for each segment. The commutation test generates 50 possible words that differ from 
help by one segment. Seven are actual words, namely kelp, whelp, yelp, held, health,
hemp, and helm, and 43 are nonsense words. It is important that actual words in the
mental lexicon should be surrounded by nonsense words in signal space. The reason 
for this is that if a word is mispronounced or misheard, then the output or input will 
not result in an actual but unintended word. It will be a nonsense word which will be 
detected by the speaker or hearer as an error and thus the message can be sent again. 
Furthermore, the greater the interword distances are, the less likely it is that a speaker’s
misarticulations, hypoarticulations, dialectal variations or foreign accent will be
confused with other words by the hearer. All of these articulations may be interpreted 
as departures from a local norm, but they would not be heard as another unintended 
word production. It should be noted that syntactic context and the non-linguistic
context also assist a hearer in error detection and error correction. If someone yells
out kelp me in a swimming pool, the hearer may know that kelp is a noun and therefore
cannot be an imperative. Given the context of the swimming pool, the hearer
will rapidly deduce that someone needs help.
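
The commutation test can be sketched computationally. The Python fragment below is a minimal illustration and not the procedure used to obtain the figures above: the ten-word lexicon, the broad transcriptions, and the segment inventory derived from them are all assumptions made for the example, and no phonotactic filtering is applied, so the number of nonsense forms it produces is not comparable to the 43 reported for help.

```python
# Hypothetical toy lexicon: each word is a tuple of segments in a broad transcription.
LEXICON = {
    "help":   ("h", "ɛ", "l", "p"),
    "kelp":   ("k", "ɛ", "l", "p"),
    "whelp":  ("w", "ɛ", "l", "p"),
    "yelp":   ("j", "ɛ", "l", "p"),
    "held":   ("h", "ɛ", "l", "d"),
    "health": ("h", "ɛ", "l", "θ"),
    "hemp":   ("h", "ɛ", "m", "p"),
    "helm":   ("h", "ɛ", "l", "m"),
    "hat":    ("h", "æ", "t"),
    "yellow": ("j", "ɛ", "l", "oʊ"),
}

# Assumption: the inventory is just the set of segments that occur in the toy lexicon.
INVENTORY = sorted({seg for segs in LEXICON.values() for seg in segs})

def commute(segments, inventory):
    """Yield every form obtained by replacing exactly one segment."""
    for i, old in enumerate(segments):
        for new in inventory:
            if new != old:
                yield segments[:i] + (new,) + segments[i + 1:]

def neighbors(word):
    """Return the actual words one substitution away, and the count of nonsense forms."""
    forms = {segs: w for w, segs in LEXICON.items()}
    candidates = list(commute(LEXICON[word], INVENTORY))
    real = sorted(forms[c] for c in candidates if c in forms)
    return real, len(candidates) - len(real)

real_words, nonsense_count = neighbors("help")
print(real_words)
print(nonsense_count)
```

With this toy lexicon the seven actual neighbors of help are recovered, and every other substitution yields a form that is not in the lexicon and could therefore be flagged as an error.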

Given that actual lexical items should have near neighbors that are nonsense 
words, severe restrictions are placed on the arbitrariness of the sound-meaning relationships.
For example, if an innovated word for a new hue of yellow is required, then
one should not select yollow, yeelaw, rellow, etc. as the new phonological form, since 
the new related hue may sound like a mispronunciation of yellow, which is not what 
was intended. The word saffron would be a better choice. 

3.2. evidence from thesauri. Where does one see evidence of the above error-detecting
principle? First, one may look in the classical sources of semantically
similar items, the thesauri. Synonyms, most importantly, but also antonyms and 
hyponyms, form classes of semantically related words whose phonological closeness 
can be examined. Synonyms that are highly semantically similar, polar antonyms that 
are opposites, and hyponyms that are members of a class, should all be phonologically 
dissimilar. For example, 

Synonyms:

(1) easy: simple, effortless, straightforward, uncomplicated, facile, light, smooth
(2) melt: liquefy, thaw, fuse, dissolve, deliquesce, soften
(3) crest: top, apex, summit, pinnacle, peak, ridge, crown, head

Polar antonyms:

(1) easy: difficult
(2) melt: freeze
(3) crest: foot

A hypernym/hyponym set:

woodwind: bassoon, clarinet, cor anglais, English horn, flute, oboe, piccolo, recorder.

By casual inspection, we see that the above semantically related items are all quite 
phonologically dissimilar. Inspection of the Penguin Dictionary of English Synonyms 
and Antonyms, which has roughly 12,000 entries, and the Oxford Thesaurus of Current
English, which has roughly 9,700 entries with 150,000 synonyms, supports the
major idea that most synonyms, antonyms, and hyponym sets are quite phonemically
dissimilar to their primary word entry and to each other. These books do not include
polysemous words which are direct extensions via metaphor or image schema. However,
derivational relatives (paronyms) are entered as synonyms. For example, dominate
has domineer as a synonym. Most derivations will necessarily be phonemically
close in distance and provide an exception to the above coding principle. 

Homophones are the best cases exemplifying the coding principle that if words are
phonemically similar then they should be semantically dissimilar. However, before
inspecting some examples, one has to exclude homophones resulting from a derivational
relationship. Words such as entrance and entrants, adulteress and adulterous
would have to be excluded. Their closeness in phonemic distance is due to the shared 
root morpheme and the homophony of the derivational affixes. 

Examples of non-derivational homophones with their meaning are:

(1) fence ‘barrier’
    fents ‘cloth remnants’

(2) chute ‘narrow passageway’
    shoot ‘discharge a weapon’

(3) roe ‘fish eggs’
    rho ‘17th Greek letter’
    row ‘propel a boat’
    row ‘order series of objects’

(4) ait ‘little island’
    ate ‘consumed food’
    eight ‘cardinal number’

It may be noted that, in many cases, these homophones are members of different 
lexical classes and vary as to frequency of occurrence. Therefore they usually do not 
contrast with each other in the same syntactic paradigm. 

Another class of phonemically similar words is rhymes. In English many rhymes
are a result of derivational suffixing, compounding of major lexical classes and phonesthemic
groupings. The choice of monosyllabic rhymes provides more instances of
the same principle. Inspecting the rhymes boon, croon, dune, goon, hewn, June, loon,
moon, noon, prune, rune, soon, spoon, swoon, strewn, and tune, we again note that none
belong to any recognizable semantic group. However, some do belong to the same
part of speech.

Thus inspection of several large glossaries appears to support the general principle 
of error correcting coding—that signals that are semantically close maximize their 
distance across the signal space. They are like lexical magnets of the same polarity. 

3.3. cross-language evidence. Last, we may ask whether the above principle holds
typologically. Twenty-one dictionaries and one grammar were surveyed in order to 
obtain words for twenty-eight body parts (meronyms). The twenty-three languages, 
including English, were sampled from sixteen macro-families. In a few cases, two 
languages were taken from the same group if they were geographically distant, such as 
Japanese and Turkish from the Altaic group. The languages used are listed in Table 1
below. The list of body parts was chosen as a demonstration set of semantically closely
related words because it was thought that they would be universal. Some body-part 
words out of the twenty-eight could not be used, since they were morphologically
complex in some languages or periphrastic in others. Eventually a core list of nine
words, namely eye, finger, foot, hair, hand, head, mouth, nose, and teeth, was selected
for comparison.

A method for measuring phonemic distances among the thirty-six word pairs in 
each language yielded by the set of nine was based on a procedure from Vitz and 
Winkler (1973), who developed an algorithm of simple phonemic template-matching. 
Their procedure can be beset by some arbitrariness when words of different length 
are matched. In this study, the canonical forms of the words in each language were 
adhered to as closely as possible for the pattern match. A convention used by Vitz 
and Winkler for missing segments in a match was to place an asterisk in the empty 
position. For example, when a CV word is matched with a CVC word, the alignment 
would be CV* against CVC. A more involved case of alignment, for example, could be 
a word of CVCVC canonical form against a word of CCVCCV canonical form. There 
are four possibilities for inserting asterisks into the first word, namely, C*VC*VC, 
*CV*CVC, C*V*CVC, and *CVC*VC. The second word would have one asterisk 
inserted after the final vowel, i.e. CCVCCV*. The choice of one of the four depends 
on the phonotactics and canonical forms of the language in question. The resulting 
alignment would have seven positions for each word. Next, after alignment is made 
between two words, a distance is computed by counting the number of phonemic 
mismatches in the pair of words. In this example, if there are no identical phonemes 
for any position, the distance is 7. 

Examples from the data are as follows for the words for ‘foot’ and ‘hand’: Hausa
ka‘*fa ‘foot’ versus ha‘nnu ‘hand’, a distance of 4; Tibetan kaypa ‘foot’ versus la:kpa
‘hand’, distance = 4; Indonesian kaki ‘foot’ versus tangan ‘hand’, distance = 5 (note that
kaki must be rendered as ka*ki*); and Tamil kaaladi ‘foot’ versus kai ‘hand’, distance =
5 (note that ‘hand’ must be kai***).
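
The mismatch count described above can be sketched in a few lines of Python. This is only an illustration, not the procedure as implemented by Vitz and Winkler or in this study. It assumes that the two words have already been aligned by hand with asterisks, and that each character of the transcription stands for one segment (the Hausa glottal mark is written here with an ASCII apostrophe). The function names are hypothetical.

```python
from itertools import combinations


def template_distance(word_a, word_b):
    """Count the positions at which two pre-aligned words differ.

    Both words must already be padded with '*' to the same length;
    an asterisk set against a phoneme counts as a mismatch.
    """
    if len(word_a) != len(word_b):
        raise ValueError("words must be aligned to the same length")
    return sum(1 for a, b in zip(word_a, word_b) if a != b)


def count_pairs(n_words):
    """Number of unordered word pairs yielded by a set of n_words."""
    return sum(1 for _ in combinations(range(n_words), 2))


# Examples taken from the text; the alignments are supplied by hand.
print(template_distance("ka*ki*", "tangan"))  # Indonesian 'foot' vs 'hand' -> 5
print(template_distance("ka'*fa", "ha'nnu"))  # Hausa 'foot' vs 'hand' -> 4
print(count_pairs(9))                         # nine body-part words -> 36 pairs
```

The sketch leaves the hard part, choosing among competing alignments, to the analyst, which is where the text locates the arbitrariness of the procedure.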

Table 1 reports the average distance for the thirty-six possible word pairs in each 
language in the first column and the average word length of the nine words in the 
second column. It should be noted that, in many of the languages, the average interword distance is greater than
the average word length. This is in part due to the lengthening effect of the Vitz and
Winkler template-matching technique. However, and perhaps more interestingly, it 
is also due to the fact that in tone languages, two words of equal length and identical 
segmental matches will still have a distance equal to 1 since the tone is different; the 
addition of a different tone does not change the word length. In information theoretic 
terms, this increases the noise resistance of the signal without adversely affecting 
the fading of the signal in memory. It should also be noted that in many languages, 
the body-part set has a mixture of words of different syllable lengths. The Vitz and 
Winkler procedure used to measure distance captures this only through segmental 
mismatches. However, the prosodic difference in syllable count (sonority) produces 
even more marked distance than the metric suggests. This leads to the conclusion that
the actual perceptual distances among the words in the sets may be greater than those
shown in Table 1.

Language        Average interword distance   Average word length
Arabic          5.69                         4.56
Basque          4.18                         3.89
Cree            5.22                         5.22
English         4.14                         3.44
Finnish         4.82                         4.33
German          3.78                         3.78
Georgian        3.36                         4.44
Hawaiian        4.56                         5.11
Hausa           4.57                         4.56
Hungarian       3.04                         3.00
Indonesian      4.81                         4.89
Inuit           4.79                         5.33
Japanese        3.50                         3.11
Mandarin        4.67                         4.11
Navaho          3.78                         3.11
Sara-Ngambay    3.81                         3.00
Swahili         3.96                         4.33
Tarahumara      3.89                         4.00
Tamil           4.19                         4.22
Tibetan         4.64                         3.78
Turkish         4.14                         3.89
Wik-Mungkan     3.58                         3.78
Yoruba          3.54                         3.44


Table 1. Cross-language comparison of average interword distance and average word 
length for nine body-part terms. 

In summary, Table 1 shows us that the sets of nine body-part words range from 
5.69 to 3.04 in average distance, with English falling in the middle of the range, and 
from 5.33 to 3.00 in average word length, again with English in the mid range. For 
words of this length, the average interword distances are clearly very large. This indicates
once more, this time using a methodology different from simple observation of
thesauri, that words in semantically similar sets are widely separated from each other 
in phonetic space in the mental lexicon. 
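
The ranges quoted in this summary can be checked mechanically against the figures of Table 1. The short Python sketch below does so and also counts the languages in which the average interword distance exceeds the average word length. It assumes that the printed figures are read as decimal numbers exactly as they appear in the table; the variable names are illustrative only.

```python
# (language, average interword distance, average word length), from Table 1
TABLE_1 = [
    ("Arabic", 5.69, 4.56), ("Basque", 4.18, 3.89), ("Cree", 5.22, 5.22),
    ("English", 4.14, 3.44), ("Finnish", 4.82, 4.33), ("German", 3.78, 3.78),
    ("Georgian", 3.36, 4.44), ("Hawaiian", 4.56, 5.11), ("Hausa", 4.57, 4.56),
    ("Hungarian", 3.04, 3.00), ("Indonesian", 4.81, 4.89), ("Inuit", 4.79, 5.33),
    ("Japanese", 3.50, 3.11), ("Mandarin", 4.67, 4.11), ("Navaho", 3.78, 3.11),
    ("Sara-Ngambay", 3.81, 3.00), ("Swahili", 3.96, 4.33),
    ("Tarahumara", 3.89, 4.00), ("Tamil", 4.19, 4.22), ("Tibetan", 4.64, 3.78),
    ("Turkish", 4.14, 3.89), ("Wik-Mungkan", 3.58, 3.78), ("Yoruba", 3.54, 3.44),
]

distances = [d for _, d, _ in TABLE_1]
lengths = [n for _, _, n in TABLE_1]

# Range of each column, as quoted in the summary paragraph.
print(max(distances), min(distances))
print(max(lengths), min(lengths))

# Languages whose average distance is strictly greater than average length.
greater = [name for name, d, n in TABLE_1 if d > n]
print(len(greater), "of", len(TABLE_1))
```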

4. other considerations. Alternative approaches to the determination of distances 
among phonetic categories have been made by Shepard (1972). Using the confusion 
matrices based on masking and filtering experiments by Miller and Nicely (1955), 
Shepard computed distances among 16 English consonants from the confusion 
matrices and applied multidimensional scaling techniques to obtain underlying 
spatial dimensions which appeared to be similar to scalar distinctive features. Analogously,
this process could be carried out for the 82 consonants and 25 vowels of the
IPA. Subjects would have to be phonetically trained practitioners of the IPA. From
masking experiments with several noise varieties and filtering with various settings, 
confusion matrices could form a standardized basis for estimating phonetic distance 
and calculation of dimensions of phonetic space. In another approach, Shepard also 
applied the techniques of hierarchical clustering analysis to determine groups of 
similar phonetic types and their interdistances. 
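
As a rough illustration of the first step, turning a confusion matrix into distances, the following Python sketch uses a symmetric similarity ratio and a negative-logarithm transform. Both the choice of formula and the three-consonant matrix are assumptions made for illustration: the counts are invented, they are not Miller and Nicely's data, and the sketch stops short of the multidimensional scaling step.

```python
import math

# Hypothetical confusion counts: rows = consonant spoken, columns = consonant heard.
LABELS = ["p", "t", "k"]
CONFUSIONS = [
    [80, 12, 8],
    [10, 85, 5],
    [9, 6, 85],
]


def confusion_distance(counts, i, j):
    """Distance between categories i and j derived from confusion counts.

    Similarity is the ratio of cross-confusions to correct identifications
    (an assumed measure); distance is its negative logarithm, so that
    frequently confused sounds come out close together.
    """
    if i == j:
        return 0.0
    similarity = (counts[i][j] + counts[j][i]) / (counts[i][i] + counts[j][j])
    if similarity == 0:
        return math.inf  # never confused: treated as maximally distant
    return -math.log(similarity)


for i in range(len(LABELS)):
    for j in range(i + 1, len(LABELS)):
        print(LABELS[i], LABELS[j], round(confusion_distance(CONFUSIONS, i, j), 3))
```

A full distance matrix built this way is the kind of input that scaling or clustering procedures would then take.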

5. conclusion. This study invites us to develop finer measurements for sound differences
and to initiate extensive speculation about the comparison of lexical items in
a semantic metric space. This latter would, in effect, be a method to develop a typology
for semantics, which would then allow us to have a geometry of the lexicon. As
a visual image, the lexicon could be viewed as a galaxy with the stars as actual lexical
items, their mass as phonetic substance/semantic substance and their differences as
the distances between stars.


TYPES OF EVIDENCE FOR A REALISTIC APPROACH TO LANGUAGE


Sydney M. Lamb 
Rice University 


The use of the word ‘realistic’ in the title may suggest that I consider some
approaches to language unrealistic. The suggestion is intentional. Linguists have commonly
assumed the existence of various entities that can easily be shown to be fictitious
or illusory, including especially ‘language’ itself, as that term is commonly
understood—as representing a ‘system’ shared by members of a community. Such a 
belief has serious problems when confronted with reality, of which I shall mention 
just two: first, its assumption that different members of a community share the same 
system; second, its lack of any physical grounding. To consider the first problem 
briefly, it can easily be shown that every person’s linguistic system is different from 
that of every other person. On the assumption that this point is fairly clear, I shall not 
elaborate here. But its consequences must be made explicit: It makes no sense for linguistics
to consider as its goal the description of languages as such, since such objects
have no existence in the real world. What does exist, then? The linguistic system of 
the individual human being. Each person has one, different from that of every other 
person, as already mentioned, and often including information commonly considered
to belong to different languages.

Now what about the second problem, that of a physical grounding? The problem 
becomes altogether different as soon as we accept that our real object of study is the 
linguistic system of the individual, for such a system has a clear physical grounding, 
and it is neurological: as 140 years of aphasiology have shown, our linguistic information
is in our brains, largely in our cerebral cortices.

We thus have a conception of what linguists ought to be investigating that differs 
sharply from what many linguists believe themselves to be investigating: not some 
illusory shared disembodied system, and certainly not the system of the ‘ideal 
speaker-hearer’, for that, too, is something which does not exist, but the linguistic 
system of a representative person, a neurocognitive system (Lamb 1999). 

Even among the minority of linguists who accept that language has a neurological
basis, some also believe in the existence of things such as rules of grammar, and
words or morphemes as objects of some kind. Such rules and similar symbolic information
would have to be included in that neurological basis; they would therefore
have to be represented in the brain. But no one has ever found any evidence at all to 
support the belief that they are. The belief that there are words or morphemes or the 
like in the brain comes from an unstated and untenable assumption: that what comes 
out of a person’s mouth must have been inside the person, even inside the person’s 
brain, before it came out—as if the person were some kind of vending machine. The 
alternative is to suppose that what a person really does is to produce words ‘on the
spot’—with the consequence that what is internal is not words or morphemes or 
the like, but the means of producing such forms. 

The question we must consider is: what kinds of evidence can we find for the neurocognitive
linguistic system of a person? It seems clear that we should include both
linguistic evidence, including that relating to language processing, and neurological 
evidence. 

Let us begin with linguistic evidence. There is a great deal of evidence that is routinely
ignored by most linguists. Perhaps most obvious is the fact, observed throughout
every day, that people are able to talk and to understand one another. Yet linguists
almost uniformly neglect this evidence, working with theories of language that have 
no way of being put into operation for speaking and understanding. 

I shall now argue that consideration of the whole range of linguistic evidence, like 
that mentioned below, makes it clear that a person’s linguistic structure is a network,
a system in which all the information is in the connectivity. The view that a linguistic
system is a network of relationships was put forth already by Saussure early in
the last century, and it was given considerable support by Hjelmslev, in work that 
was unfortunately not widely appreciated (1943/61), partly because it was difficult to 
understand. In fact, this idea was not fully appreciated even by me, although I became 
its champion during the sixties, until several years after I first became acquainted with 
his work in the fifties. In keeping with a corollary of the Whorf principle, that notation
influences thinking, it was only when I started using a notation for depicting
relationships directly, under the influence of Halliday’s notation for his systemic networks,
that I was able to appreciate that not only does a linguistic system have a lot of
relationships among its units, but that when those relationships are fully plotted, the
units as such disappear, as they have no separate existence apart from those relationships
(Lamb 1970, Lamb 1999:53-62).

Aside from that argument, which I shall not repeat here, there are several additional 
kinds of linguistic evidence to support the view that a linguistic structure is a network 
of relationships. And when you put them all together, in my opinion, the evidence is 
overwhelming; and the alternative view, that uses rules and other symbolic information,
seems quite unable to handle these kinds of linguistic evidence. It is important to
appreciate that this view is arrived at and justified purely on the basis of linguistic evidence.
It is only during the past ten years that I have seriously investigated whether the
relational network theory is supported also by neurological evidence. I now enumerate 
some pieces of this linguistic evidence, a baker’s dozen of items: 

1. Coexistent alternative analyses. For example, hamburger (Lamb 1999:233-36).
The network allows both ham + burger, and Hamburg + er to be present. 

2. Multiple parallel interpretation of (many) complex lexemes (cf. Muller
2000, Lamb 1999:184-97). For example, the Chinese compound zhong ‘central,
middle’ plus guo ‘kingdom’ is the name for China; but in its interpretation
it also, simultaneously and in parallel, means ‘middle kingdom’.

3. Disambiguation of ambiguous words using linguistic and extralinguistic
context. How connotations operate (Lamb 1999:187-88).

4. Phenomena involving association, such as literary allusions (e.g., to Hamlet 
by quoting) and Freudian slips. For example, the statement Something is 
rotten in the state of Florida conjures up Hamlet to people acquainted with 
the play. 

5. Degrees of entrenchment of idioms and other complex lexemes— 
accounted for by variability in the strengths of connections. 

6. Gradualness of learning, related to degrees of entrenchment. In the learning
process, connections get strengthened.

7. Context-driven lexeme selection (unintentional puns) (Lamb 1999:190-94).
For example, the selection of zoom (as opposed to the expected go) in Are 
you ready to zoom to the camera store? (Reich 1985). 

8. The interpretation of puns and other cases of ambiguity. For example, a 
talking duck goes into a bar, orders a drink, and says ‘Put it on my bill’.

9. Complex associations in slang lexeme formation. Eble (2000) gives the following
example:

Sometimes sound provides the link in a set. With the popularity of
African-American comedians came the form ho, a dialect pronunciation
of whore, for ‘a promiscuous woman’. The same sequence of sounds,
spelled hoe, refers to ‘an implement for tilling the earth’, i.e. a garden tool.
Thus ho and garden tool are current slang synonyms for ‘a promiscuous
woman’ (Eble 2000: 509).

10. Slips of the tongue (cf. Dell & Reich 1980). 

11. Prototypicality phenomena. The conceptual category bird, for example,
includes some members, like robin and sparrow, that are more prototypical
than others, like emu and penguin. The effects have shown up in numerous psychological
experiments using such evidence as reaction time for deciding
whether an item is or is not a member of the category. The relational network
model provides a simple and direct means of accounting for the phenomena,
by means of two devices which are needed anyway to account for
other phenomena: variation in the strength of connections (thus the property
of flying is strongly connected to the category bird), and variation
in degrees of threshold satisfaction. Strength of activation, strength of connections,
and number of activated connections all contribute to the speed
and degree to which the threshold of a node is satisfied. It is important to
notice that although these phenomena have been discussed in the literature
for years, no means of accounting for them other than by means of a network
model has ever been proposed.

12. Realistic means of accounting for speaking and understanding. This
one is of basic importance, and the widespread neglect of the obvious and
widespread evidence (that people are able to speak, and to understand one
another, in real time) is actually shocking when you stop to think about it.
How can linguists go on, year after year, neglecting this evidence? Surely, 
the fact that people are able to speak and to comprehend one another (if 
imperfectly) cries out for explanation. The relational network model, whose 
origin over thirty-five years ago was motivated partly by this overwhelming
evidence, provides a simple and direct means of such accounting: by
the ‘travelling’ of activation through the pathways provided by the network 
(Lamb 1999). How long will the linguistic community continue to suppose 
that the ability of people to speak and to understand requires no explanation?

13. On-line cognitive processing in conversation. This rich but neglected 
opportunity for study, again blessed by abundant but neglected evidence, 
has been explored by Cynthia Ford Meyer in her two prize-winning LACUS
papers (1991, 1992), in her dissertation, and in a more recent publication 
(2000) (see also Lamb in press). Strangely and sadly, her work has not yet 
encouraged others to undertake similar explorations. Here I will give one 
example, not from her work but from my own analysis (Lamb 1999:202) of a 
courtroom exchange reported by Lederer (1987). 

Attorney: Mrs. Jones, is your appearance here this morning pursuant to a 
deposition notice which I sent to your attorney? 

Witness: No. This is how I dress when I go to work. 

We can observe a number of phenomena that are readily accounted for by the relational
network approach. The witness is evidently concerned about her appearance
and believes that a woman’s clothing contributes to her appearance. Beliefs are registered
as conceptual subnetworks, and matters of present or ongoing concern register
as weak activation in these networks. Such activation is increased by emotional 
stimulation. To this factor we add another: unfamiliar lexemes or locutions are likely 
not to provide much conceptual activation, if any, because the connections that would 
provide activation are weak or lacking. So the lexeme pursuant and the possibly unfamiliar
expression pursuant to a deposition notice, although they were surely received
by her phonological recognition system, probably didn’t generate much activity in 
her lexico-grammatical system, therefore little or none in her conceptual system. 
In addition, any emotional affect aroused by someone’s seeming to draw attention 
to her appearance would deflect attention that might otherwise be directed toward 
attempting to understand the passage beginning with pursuant. The factor of attention
has a global effect on degrees of threshold satisfaction. As a result, that latter part
of the sentence, which in an attorney’s cognitive system provides strong contextual 
activation to one interpretation of the lexeme appearance (the intended one), fails to 
have such an effect in the woman’s system, and the other interpretation has in any case 
already been activated by the time the phrase beginning with pursuant was received.

There is an opportunity for many more fruitful studies along the lines opened up
by Meyer (1991, 1992, 2000).

In any case, this brief survey suggests that considerable linguistic evidence exists 
for the network model, impressive in its abundance and variety. And we should 
not overlook that these are all real phenomena that therefore must ultimately be 
accounted for by linguistic theory. The fact that most linguists have seen fit to neglect
such data in the past is irrelevant. And, I might add, unfortunate. These phenomena 
all strongly support the network model. No one has ever given an indication that any 
alternative means of accounting for them is available. This fact alone might be seen 
as rather compelling evidence. Those who find it reasonable to believe that the brain 
stores rules of grammar, or words or morphemes or other symbols, or that it operates 
like a computer, need to rethink their position: such a belief must either be supported 
by some kind of evidence, or else it should be abandoned. 

Now we are ready for the next step. As our linguistic systems are represented in our 
cerebral cortices, it would be nice also to have some neurological evidence. We can look 
at what is known from neuroscience to either support or cast doubt upon the model. 
It is also appropriate to consider how the alternative, a symbol-based model, stacks up 
against the neurological evidence. Never mind that it has not been fashionable to consider
such evidence in the past. Enough is known now about the structure and operation
of the brain to make neurological evidence part of the arsenal of linguistics.

Fortunately, there is a large amount of usable neurological evidence that bears on 
these matters. We can divide it into two portions, dealing with ‘macro-structure’ and
‘micro-structure’. 

At the level of macro-structure we have considerable evidence from 140 years of 
aphasiology relating to the presence in the cortex of different linguistic subsystems 
and their locations. Let me here mention just two of these findings. First, the hypothesis
of stratification of linguistic structure is supported. This is the hypothesis that
there are different subsystems for different linguistic levels or strata. It is supported
by findings from aphasiology that, for example, phonological structure is distinguishable
cortically from lexical structure and semantic structure. The second interesting
point is that, unexpectedly for most linguistic theories, both of the network and the
symbolic variety, there is a clear cortical distinction between two subsystems for phonological
structure, one for phonological production, the other for phonological recognition.
Thus the hypothesis of stratificational grammar (and other theories) that
there is a phonological stratal system is seen to be in need of revision, for there are 
two stratal systems for phonology. As it happens, this revision is quite easy to make in 
a network theory, and it produces beneficial consequences for the theory’s ability to 
account for phonological phenomena. 

More important for the basic question of choosing between a network model and 
a symbolic model is the evidence at the level of micro-structure. Here we are concerned
with the basic question of how linguistic information is represented in the
brain. The question can perhaps best be introduced by considering some of the conceivable
possibilities.

possibility 1. I mention this one because it was current among lay people when I was
a child. It was thus the first hypothesis that I considered in my lifelong philosophical
exploration. It takes note of the many grooves in the cortex, called sulci, and proposes
that these grooves store information in the way that the grooves of a phonograph
record store information. Needless to say, no one takes this hypothesis seriously nowadays.
But it is worth taking note of one very interesting feature of this proposal:
It results from a hypothesis-forming method that operates by metaphoric extension
from present-day technology.

possibility 2. This one results from the same metaphoric process, but as technology
has changed, it is more modern. Its metaphorical basis is the computer, and the hypothesis
is that the brain stores information in much the same way as a computer. In my
opinion it is nearly as ridiculous as the phonograph record hypothesis, and has no more
basis for reliability than that the computer is a more recent technology than the phonograph
record. But why should Nature have evolved, over the millions of years, an
information-processing device (the brain) that just happens to use the same technology
as that which was invented during the latter half of the twentieth century? What a
remarkable coincidence! Yet just this hypothesis is explored by some of those regarded
as leading thinkers in neuroscience (see for example Churchland and Sejnowski 1992).
According to the more naive versions of this hypothesis, information is stored as combinations
of binary digits, or perhaps as other kinds of symbols. It is easy to think about
and it fits well with our habits of thinking of information as consisting of symbols represented
physically in some medium, since that is the way we have long been accustomed
to treating information represented externally to our brains. On paper, on blackboards,
in computers, information consists of symbols represented in some medium, whether paper,
blackboard, or electronic storage. But that doesn’t make the hypothesis correct (Lamb
1999:114-16). In order to win neurological support, such a hypothesis needs to show
that neurons or groups of neurons are capable of storing binary digits or other symbols.
Moreover, it needs to show how such symbols are operated on in such processes as recognition
and production. We know how recognition operates in a computer: it depends
upon a process of comparison. Given an item to be recognized, any of various strategies
is used to find a candidate among the items stored in the memory, and then this candidate
is compared to the input item. If they match, successful recognition has occurred; if
they do not, another candidate can be searched for; and so forth. Not to be overlooked is
that such a process requires additional equipment not yet mentioned: a buffer in which
to store the input item while the process is going on, a workspace in which to perform
the comparison, and most important an executive device of some sort, a homunculus,
which executes the process; that device thus requires some kind of knowledge of that
process and how to carry it out. These features of the hypothesis are easy to overlook
since we humans execute recognition and comparison of external symbols all the time,
having learned the operations through thousands of instances of practice. We thus take
them for granted and have to be reminded that they actually involve a great deal of
information processing.

In any case, considerations like these must enter into the testing of any hypothesis
of how the brain represents information.
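
The comparison-based recognition procedure described under possibility 2 can be made concrete with a minimal Python sketch. It is an illustration of the general search-and-compare idea only, not a model taken from any of the works cited; the function name and the sample word list are hypothetical. The comments mark where the buffer, the workspace, and the executive device each enter.

```python
def recognize_by_comparison(input_item, memory):
    """Recognition as search-and-compare over stored symbols."""
    buffer = input_item           # buffer: holds the input while the search runs
    for candidate in memory:      # executive device: decides what to try next
        if candidate == buffer:   # workspace: where the comparison is performed
            return candidate      # match found: successful recognition
    return None                   # every candidate was tried and none matched


stored_words = ["foot", "hand", "head"]   # hypothetical memory contents
print(recognize_by_comparison("hand", stored_words))
print(recognize_by_comparison("nose", stored_words))
```

Even in this tiny form, the loop and the equality test are supplied by the programmer, which is the point made above about the knowledge the executive device must have.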

possibility 3. The third hypothesis I will mention is the network hypothesis. The 
second and the third are in competition, as the only two hypotheses that are seriously 
entertained nowadays. Actually, the network hypothesis comes in several varieties, as 
several forms of network have been proposed. The best known of them is often called 
connectionism or PDP (parallel distributed processing) (Rumelhart & McClelland 
1986), although those terms appropriately apply to a whole range of alternative network
hypotheses and not just to this one. That is unfortunate since this well-known
hypothesis is among the most unrealistic. That is, it is among those most lacking in 
supporting evidence from neuroscience (Lamb 1999:118-19). Even more unfortunate 
is that models of this kind are often referred to as neural network models, using a 
name that suggests a resemblance to real biological neural networks, even though 
they lack such resemblance. On the plus side of the ledger, however, they do share 
with relational networks a basic property that sets them apart from computer models: 
they do not store binary digits or symbols. 

It may be instructive, before proceeding, to see how the network model handles 
recognition. Let us suppose that a word is being received by the system. If it is a 
spoken word, it will activate the nodes for its auditory features, and these will pass 
activation on up to higher-level nodes, perhaps representing phonemes. (I say ‘perhaps’
because we do not yet know what units the phonological recognition system
operates with; if not phonemes, then some other units, perhaps transitions from one phoneme
to another. No matter: the process works the same way whatever units
are utilized.) These higher-level nodes, those activated by this particular word, in turn
pass their activation on to a still higher-level node representing the word. It is the
activation of that node that constitutes recognition of the word. Notice that no buffer
is needed, nor any workspace, and most important, no executive device or homunculus.
Each node in the network is its own processor, operating on a simple principle:
when it receives enough activation to surpass its threshold, it passes activation on to 
higher-level nodes to which it is connected. 
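
A toy version of this recognition process can be written in Python. It is a sketch under stated assumptions, not the relational network notation of Lamb 1999: the class and function names, the thresholds, the connection strengths, and the choice of one node per segment of the word hand are all invented for illustration, and a node here fires when its accumulated activation reaches its threshold.

```python
class Node:
    """A node that passes activation upward once its threshold is reached."""

    def __init__(self, name, threshold):
        self.name = name
        self.threshold = threshold
        self.received = 0.0
        self.fired = False
        self.upward = []  # (higher-level node, connection strength) pairs

    def connect(self, higher, strength=1.0):
        self.upward.append((higher, strength))

    def receive(self, amount):
        # Each node is its own processor: it only sums what arrives
        # and fires once, when the total reaches its threshold.
        self.received += amount
        if not self.fired and self.received >= self.threshold:
            self.fired = True
            for higher, strength in self.upward:
                higher.receive(strength)


def build_word_network():
    """Four segment nodes feeding one word node (hypothetical example)."""
    word = Node("word:hand", threshold=4.0)
    segments = [Node("segment:" + s, threshold=1.0) for s in "hand"]
    for segment in segments:
        segment.connect(word)
    return segments, word


segments, word = build_word_network()
for segment in segments:
    segment.receive(1.0)      # activation arriving from lower-level nodes
print(word.fired)             # the word node fires: the word is recognized
```

Nothing in the sketch stores the word as a symbol or compares it with the input; recognition consists solely in the word node's firing.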

In order to test the model against the neurological evidence, we need a hypothesis 
of how the nodes of the network (called ‘nections’ in Lamb 1999) are represented physically
in the cortex. Based on work by Mountcastle (1998), Burnod (1988/90), and Arbib
et al. (1998), I have adopted the hypothesis that the node of the network is implemented 
neurologically as a cortical column (Lamb 1999:323-26). A cortical column, also known 
as a minicolumn, contains about 80-110 neurons on average (more in the primary 
visual cortex of primates), and extends through the six layers of the cortex. About 70% 
of the neurons in a typical column are pyramidal neurons, and the remaining 30% 
consist mainly of inhibitory neurons of various types and, in layer IV, spiny stellate 
neurons. The pyramidal neurons provide excitatory connections to other columns, 
either neighboring or distant, while the inhibitory neurons provide inhibition to
neighboring columns and within the same column. 

We are now at a very important point in this journey, and it is helpful to be fully 
aware of how to use the available evidence as we assess our progress and plan the next 
steps. In keeping with standard scientific practice, it is important to ask certain questions
as a means of testing a theoretical model:

First, is there any data that this way of looking at things handles better than extant 
models? Does it make better sense of the data than competing models do? From the 
examination of linguistic evidence surveyed above, we have obtained a resounding 
‘Yes!’ Not only has no other model of language ever even attempted to handle most 
of the data considered; it is even difficult to imagine how they could be treated in any 
other way than by means of a network model. 

Second, are there any predictions made by the model that can be tested, either by 
experiment or by observation? Another way of stating this point is to use the concept 
of falsifiability: What kinds of data would falsify the model? We can ask this question 
in the context of the columnar hypothesis. The relational network model requires that 
certain kinds of connections be present among its nodes, and that these connections 
have certain properties. The relevant properties, all arrived at through consideration 
of the linguistic evidence, as detailed in Lamb 1999, may be listed as follows: 

1. Connections carry varying degrees of activation. 

2. Connections can have varying strengths. 

3. Connections get strengthened through successful use (the learning process).

4. Nodes have varying thresholds. 

5. The threshold of a node can vary over time (part of the learning process). 

6. Connections are of two types: excitatory and inhibitory. 

7. Excitatory connections are bidirectional (feed-forward and feed-backward) 
(Lamb 2000). 

8. Excitatory connections can be either local or distant. 

9. Inhibitory connections are local only. 

10. Inhibitory connections can connect either to a node or to a line. 

11. In early stages (pre-learning) most connections are very weak (latent). 

12. A node must contain an internal wait (delay) element (for ordered ‘and’ 
node). 
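
A few of these properties can be made concrete in a small sketch. It covers only properties 1 to 4, 6 and 11; the class names, the starting strength of 0.05 and the particular strengthening rule are assumptions chosen for illustration, not values or mechanisms given in Lamb 1999.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    """Hypothetical connection between two nodes."""
    strength: float = 0.05     # property 11: latent (very weak) before learning
    inhibitory: bool = False   # property 6: excitatory or inhibitory

    def transmit(self, activation: float) -> float:
        # Properties 1 and 2: the activation delivered varies with the
        # activation sent and with the strength of the connection.
        sign = -1.0 if self.inhibitory else 1.0
        return sign * activation * self.strength

    def strengthen(self, rate: float = 0.5) -> None:
        # Property 3 (assumed rule): each successful use moves the strength
        # part of the way toward a maximum of 1.0.
        self.strength += rate * (1.0 - self.strength)


@dataclass
class ThresholdNode:
    """Hypothetical node with its own threshold (property 4)."""
    threshold: float = 0.5

    def fires(self, inputs) -> bool:
        return sum(inputs) > self.threshold


link = Connection()
node = ThresholdNode()
fires_before = node.fires([link.transmit(1.0)])  # latent connection
for _ in range(3):                               # three successful uses
    link.strengthen()
fires_after = node.fires([link.transmit(1.0)])
print(fires_before, fires_after)
```

The latent connection delivers too little activation to fire the node; after repeated strengthening the same input suffices. Bidirectionality, locality and the delay element (properties 7 to 10 and 12) are left out of the sketch.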

It is important to keep in mind that all of these rather specific properties of the network
are determined by linguistic considerations, not neurological ones. (In fact, the
relational network hypothesis has been around now for over thirty-five years, and it is
only during the last ten years that I have undertaken the study of neuroscience.) They
are properties that are required by the need to account for the linguistic data and linguistic
processes, including that of learning. They thus constitute predictions from
linguistic theory about properties that must be present in the brain, if the relational
network hypothesis is correct. In terms of the falsifiability doctrine, if any of these
properties is not present in the cortical columns of the cortex and their interconnections,
then the hypothesis is thereby falsified.

What, then, do we find, upon examining the evidence from neuroscience relating 
to cortical columns and their interconnections? What we find is that every one of 
these properties is present in the minicolumns and their interconnections (Lamb 
1999:321-29). By contrast, if we compare the properties of the Rumelhart-McClelland 
model with those of cortical columns and neurons, we find that some of them are 
falsified (Lamb 1999:118-19). 

Let me continue by mentioning two additional kinds of supporting evidence. 

First, consider the receptive side of linguistic structure. The major process
involved here is perceptual. Now, to get detailed neuroanatomical studies of the cortex 
as it engages in the process of speech perception is not possible using any known 
and permissible methods. But the status of what constitutes ‘permissible’ is different 
if we consider other kinds of perception, as they are shared by other mammals. For 
example, cats and monkeys are also endowed with the capability of visual perception, 
and it is considered permissible to examine living brain tissues of cats and monkeys 
(Hubel and Wiesel 1962, 1968, 1977). Although I don’t personally approve of such procedures,
I will permit myself to mention some of their pertinent results. They find
that visual perception in cats and monkeys works in just the way predicted by the
network model for the receptive side of language. That is, it uses minicolumns as
the basic nodes in a hierarchical network in which successive layers integrate features
from the next lower layer. Similar findings have come from the examination of
the primary somatosensory cortex and the primary auditory cortex (cf. Mountcastle
1998:165-203). As Mountcastle reports (1998:181), ‘Every cellular study of the auditory
cortex in cat and monkey has provided direct evidence for its columnar organization’.
To be sure, this is indirect evidence, as it concerns auditory perception at lower
levels than those involved in speech recognition. They haven’t examined the cat’s or
monkey’s linguistic processing since it is lacking. But it is important in this connection
to observe that neuroscientists do consider it permissible to extrapolate beyond
the cats and monkeys to the supposition that human visual, auditory, and somatosensory
perception works in this same way. It is not much of a leap to suppose that speech
perception also works this way. 

Finally, we may bring quantitative evidence into the examination. Quantitative 
evidence is commonplace in physics but almost unknown in linguistics. Yet it has 
appropriate applications in linguistics. In particular, it is very important to apply 
a quantitative test of capacity. Such testing estimates the capacity provided by the 
model and compares it with that of actual people, for example vocabulary capacity. 
What we need to ask is whether it is realistic to assume availability of enough latent 
nodes, and in the right places, to get a person through a lifetime of learning. 

Let us consider the area in which we have our phonological representations. Based 
on data from aphasiology and from brain imaging studies, it is reasonable to hypothesize
that this subsystem is to be equated with Wernicke’s area in the narrow definition
of that term; that is, the superior posterior left temporal lobe, including the planum
temporale. According to the relational network theory, this area needs enough nodes 
to represent all the phonological units that might become known by a person, including
syllables, phonological words, fixed phonological phrases, in as many different
languages as a person is likely to be able to learn to speak with a high degree of fluency.
A liberal estimate would be fifty thousand per language. If we multiply by twenty
for a phenomenal polyglot with twenty fluent languages, we get a requirement of one 
million nodes. 

For our falsifiability test, we need to estimate the number of cortical columns available
in this area of the cortex, using neurological evidence. We can make a rough estimate
by measuring the cortical surface of the area and multiplying by a reasonable
estimate of the number of (mini)columns per square centimeter of cortical surface.
In a typical person, the area in question includes the posterior portion of the superior
temporal gyrus, extending also into the Sylvian fissure (the temporal plane) and the
superior temporal sulcus, perhaps also into the middle temporal gyrus. The horizontal
extent, as might be measured along the top of the superior temporal gyrus, might
be three cm or more in the typical individual, and the extent in the orthogonal direction
might include at least two cm of the temporal plane (in the Sylvian fissure), one
cm or more for the superior temporal gyrus, and two cm for the superior temporal
sulcus. From these rough measures we get a surface area of three or four cm by at least
five cm, or fifteen to twenty or more square cm. The density of neurons is around
eighty to one hundred thousand per square mm of cortical surface. To get the number
per square cm we multiply by 100. But to get the number of (mini)columns, at 100
neurons per column, we divide by 100. So, as these two factors cancel each other, the
figure of neurons per square mm is approximately the same as the number of columns
per square cm. Using the figures at the ends of both ranges, we get

from 15 cm² × 80,000 columns/cm² = 1.2 million columns

to 20 cm² × 100,000 columns/cm² = 2.0 million columns

Thus we get from 1.2 million to 2 million nodes in this area in a typical individual,
let’s say somewhere in the neighborhood of one-and-a-half million. There is an alternative
estimate of six hundred columns per square millimeter (60,000 per square cm) of
cortical surface, which would give between 0.9 million and 1.2 million. Either way the
figures are rough but close enough for our purposes. We are evidently in the range of 
one to one-and-a-half million. 
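
The arithmetic of this estimate can be checked with a short script. It uses only the figures quoted above (areas of 15 to 20 square cm, 80,000 to 100,000 neurons per square mm, 100 neurons per column, the alternative 600 columns per square mm, and 50,000 units for each of 20 languages); the function and variable names are hypothetical and chosen for the example.

```python
# Back-of-envelope capacity check using only the figures quoted in the text.
NEURONS_PER_COLUMN = 100   # "at 100 neurons per column"
MM2_PER_CM2 = 100          # 1 square cm = 100 square mm


def columns_per_cm2(neurons_per_mm2: int) -> int:
    """Convert neuron density (per square mm) to column density (per square cm)."""
    return neurons_per_mm2 * MM2_PER_CM2 // NEURONS_PER_COLUMN


def capacity(area_cm2: int, neurons_per_mm2: int) -> int:
    """Estimated number of columns in a cortical area of the given size."""
    return area_cm2 * columns_per_cm2(neurons_per_mm2)


# Requirement: 50,000 phonological units per language, 20 languages.
required_nodes = 50_000 * 20

# Capacity from the neuron-density figures (low and high ends of both ranges).
low = capacity(15, 80_000)
high = capacity(20, 100_000)

# Capacity from the alternative figure of 600 columns per square mm.
alt_low = 15 * 600 * MM2_PER_CM2
alt_high = 20 * 600 * MM2_PER_CM2

print(f"required:    {required_nodes / 1e6:.1f} million nodes")
print(f"estimate:    {low / 1e6:.1f} to {high / 1e6:.1f} million columns")
print(f"alternative: {alt_low / 1e6:.1f} to {alt_high / 1e6:.1f} million columns")
```

The factor of 100 for the unit conversion and the divisor of 100 neurons per column cancel, which is why the neuron density per square mm can be read directly as a column density per square cm.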

And so the quantities match up well—we have a requirement for one million 
nodes, and we have a capacity of around one to one-and-a-half million nodes—and 
this requirement is for a very liberal estimate of a person with phenomenal linguistic 
abilities. On the other hand, such a polyglot probably has a larger phonological recognition
area than the average person, perhaps extending into the middle temporal
gyrus. And actually, I have overestimated the requirement, since I have added together
all of the requirements for phonological forms in all of the languages, disregarding
that many of them overlap from one language to another and can therefore share the
same nodes: forms like those for taxi, coffee, and numerous technical terms.

The hypothesis seems to be supported. We have given it a test in which it could 
easily have been falsified, and it has passed. 

Since there are those in the field of neurolinguistics who think it possible that 
the linguistic system of the brain is a symbol-based system, it is fair to ask whether 
they have conducted such a test for such models. They have not. Yet their faith persists.
Why? I can ask this question, but I have no answer. But if we estimate on the
basis of how lexical information is usually conceived of in such models, we have our
required capacity of one million items (from above) to be multiplied by the number of
minicolumns or neurons needed to store each item. If the items are represented as
combinations of distinctive features and if there are on average 40 features per lexeme 
and if it takes 100 neurons (one minicolumn) to store each feature, then we have a 
need of: 

1 million lexemes × 40 columns per lexeme = 40 million columns

To emphasize how outrageously excessive this number is, we need only remind ourselves
that 40 million columns amounts to about 4 billion neurons. Thus the symbol-based
approach is decisively falsified. This is quite apart from the fact that no one
has ever shown how a minicolumn could be used to store information. On the other 
hand, the proponents of such a theory might argue that it is the neuron rather than 
the column which stores the phonological features. But that theory is also fraught 
with problems. Presumably they would have to assume that it is the pyramidal neurons
that have this function; the area in question has about 70 to 80 million pyramidal
neurons. But there would be much to explain, for some of these pyramidal neurons 
are in the upper layers of the cortex, some in the lower layers, and the two groups 
have different connectivity. Also, with no redundancy, how does the system continue 
to operate in cases of occasional neuronal malfunction and death? Not to mention 
other problems, not least of which is that no one has ever come up with a reasonable 
theory of how a neuron can be used to store information. 
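
The size of the mismatch for the column-per-feature reading can be spelled out numerically. The sketch below uses only the figures given above (one million lexemes, 40 features per lexeme, 100 neurons per column, and an available capacity of at most about one-and-a-half million columns); the variable names are hypothetical.

```python
# Requirement of a symbol-based model that stores one feature per column,
# using the figures quoted in the text.
LEXEMES = 1_000_000
FEATURES_PER_LEXEME = 40
NEURONS_PER_COLUMN = 100
AVAILABLE_COLUMNS_MAX = 1_500_000  # upper end of the capacity estimate above

columns_needed = LEXEMES * FEATURES_PER_LEXEME
neurons_needed = columns_needed * NEURONS_PER_COLUMN
shortfall_factor = columns_needed / AVAILABLE_COLUMNS_MAX

print(f"columns needed: {columns_needed:,}")
print(f"neurons needed: {neurons_needed:,}")
print(f"needed / available: about {shortfall_factor:.0f} times")
```

Even against the most generous capacity figure, the requirement exceeds what is available by a factor of more than twenty-five.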

Another quantitative test concerns the arcuate fasciculus, a fiber bundle which connects
the phonological recognition area to the phonological production area. The relational
network model predicts that this bundle has to have many thousands of fibers,
perhaps hundreds of thousands, since it requires unique connections from all low-level
nodes in the one phonological system to nodes in the other (Lamb 1999:366-67). Symbol-based
and computational models, on the other hand, require only a few hundred
fibers in the bundle, even with extensive redundancy. Here we have a test that sharply
distinguishes the competing models, one which will clearly falsify the one or the other.
And this prediction of the network model concerning the size of this fiber bundle really
is a prediction, since up to now the number of fibers in the bundle has not been counted
nor even reliably estimated in print. We should soon have results of this test; we await
publication of a neuroanatomical study of this fiber bundle.

There is also quantitative evidence relating to processing. We know that people
are not only able to speak and to understand, they are able to do so in real time. 
Any model has to meet the test of being compatible with this fact. With the network 
approach, such processing is straightforward, consisting of the spread of activation 
along pathways of the network, governed by the thresholds of nodes, and the model
agrees with how recognition works in the primary visual, somatosensory, and auditory
cortices. Contrast how recognition has to work in symbol-based systems, even
if a plausible hypothesis were forthcoming about how neurons could store symbolic
information. Again, no supporter of symbol-based systems has ever proposed, to my
knowledge, any explanation of how such a system could be used in real time to recognize
speech. In the absence of such a proposal, we may view the formidable difficulties
of devising such a model as highly suggestive. 

To sum up, I have mentioned several kinds of linguistic evidence, usually overlooked,
which suggest that a person’s linguistic system is a network. I have summarized,
with references to the literature, the results of work examining these kinds of
data, which has led to the determination of a set of specific testable properties of such
networks. In the context of the hypothesis that the nodes of a relational network are
implemented as cortical minicolumns, all of these properties of relational networks
are found to be present in the cortex. Additional confirmation is provided by a test
of the capacity needed for the vocabulary of a polyglot. A further test based on the
number of fibers in the arcuate fasciculus awaits neuroanatomical confirmation.

It is certainly true that corpus-based methods have become mainstream over the
last years, particularly in English linguistics. Both the compilation of large and representative
corpora and the computer-based analysis of such corpora in quantitative
and qualitative terms have led to insights into actual language use which could not
be obtained from invented and decontextualised examples or intuition-based judgements
alone.¹ In the present paper, particular attention will be paid to the corpus-based
description of patterns of usage and their relevance to a truly usage-based and
cognitive grammar of English. In general, I will suggest that quantitative data and 
their careful qualitative interpretation be included in future attempts to model speakers’
linguistic knowledge. This theoretical issue, referring to the widening scope of
corpus evidence, will be illustrated with a sample analysis of corpus data on the frequencies
and distributions of related lexicogrammatical patterns of the verb provide.

1. patterns and routines in language use. To begin with, I would like to summarize
what corpus-based research has brought to light with regard to authentic language
use. Generally speaking, the in-depth analysis of large amounts of data reveals
that language use is largely based on routines and patterns. Some twenty years ago, 
Pawley and Syder (1983:193) pointed to the fact that native-like language use is not 
only characterised by creativity, but at least to the same extent by routine: ‘The problem
we are addressing is that native speakers do not exercise the creative potential of
syntactic rules to anything like their full extent, and that, indeed, if they did so they 
would not be accepted as exhibiting nativelike control of the language’. 

After several decades of research into ever larger corpora, this hypothesis has been 
confirmed. This not only holds for what Pawley and Syder (1983:192) call ‘lexicalized 
sentence stems’, but for many other aspects of language use as well. Collocations such as 
vested interest(s) and stinking rich (cf. Partington 1998:26) and collocational frameworks
such as all + determiner + time (cf. Lenk 2000:189-91) show that in actual usage
the combination of words is not at all free. Not only do we encounter lexical co-occurrences,
but also ‘the co-occurrence of grammatical choices’ (Sinclair 1996:85),
i.e., colligational patterns. For example, naked eye tends to be preceded by the
definite article, a preposition and a verb or an adjective as in visible to the naked eye
(cf. Sinclair 1996:85-86). On this basis, Hunston and Francis (2000) suggest a ‘pattern
grammar’ approach to the description of frequent lexicogrammatical co-occurrences
of all kinds. For example, difficult is shown to occur significantly often in the pattern
be + difficult + for + noun group + to-infinitive (cf. Hunston & Francis 2000:131).
Finally, corpora have shed light on semantic patterns in language use. Stubbs (cf.
1995:26, 45), for example, finds that provide tends to be used in positive contexts,
whereas affect is shown to have a so-called negative ‘semantic prosody’. Also, there are 
semantic restrictions on specified word-forms of a given lexeme. For instance, Esser 
(2000:97) points out that in the 100-million-word British National Corpus (BNC), 
the singular form tree is attested with both the meaning ‘plant’ and the meaning 
‘drawing’. On the other hand, the plural form trees is semantically restricted to the 
meaning ‘plant’.²

In general, such routines and patterns in language use are not found by intuition 
and introspection, but only by the examination of large amounts of data. Note that all 
linguistic routines mentioned are derived from frequencies and distributions in text. 
These linguistic routines are typical of language use, i.e., language in performance. 
The basic question I would like to address is the following one: what may corpus data 
on language use tell us about speakers’ internalised knowledge of the underlying language
system?

2. A CORPUS-BASED MODEL OF SPEAKERS’ LINGUISTIC KNOWLEDGE. At first sight, and
in generative terminology, the aforementioned research question may seem to be
an attempt to relate language in performance to linguistic competence. However, I 
am deliberately abstaining from using the generative concepts here. A model of the 
language system that is based on corpus evidence has not much in common with a 
generative model of competence. Thus, it is not very useful to take over and extend 
or redefine the term competence, which would automatically lead to terminological 
confusion (cf. Taylor 1988 passim). Rather, I prefer to speak of a corpus-based model 
of speakers’ linguistic knowledge. 

There are, at least, three fundamental differences between the generative approach 
to competence and a corpus-based model of speakers’ linguistic knowledge. First, 
generative grammar focuses on the knowledge of what is possible in language. 
The range of what is possible is mainly identified on the basis of intuition-based 
grammaticality judgements. Second, the focus is on an ideal speaker-hearer. Third, 
as Chomsky himself has repeatedly pointed out, frequencies in text are considered 
irrelevant to competence, that is the internalised knowledge of grammar (see the 
interview B. Aarts (2000:5-7) conducted with Chomsky). On the other hand, a 
corpus-based model would be based on language used by real speakers in authentic 
contexts. It takes into account frequencies in text because the model is also intended 
to mirror speakers’ internalised knowledge of what is probable. The knowledge of 
linguistic routines and patterns includes the ability for use, i.e., the knowledge of principles
and factors which are responsible for those routines and patterns. This ability
for use corresponds to what Chomsky (1980:224) himself describes as ‘pragmatic 
competence’, which he, however, clearly separates from competence proper, namely 
‘grammatical competence’. 

A corpus-based model of speakers’ linguistic knowledge would not separate 
grammatical and pragmatic competence. More importantly, it is an attempt to bridge
the gap between what speakers know and what speakers use. I follow in this regard
Halliday’s (cf. 1991:31) view that system and use are inseparable, that in fact language 
use is an instantiation of the language system. A model of speakers’ knowledge of that 
system should therefore account for observable language use as attested in corpora. 
From a complementary perspective, by carefully observing language use we may 
catch a glimpse of the language system and speakers’ knowledge of that system. 

Considering the general outcome of more than thirty years of corpus-linguistic 
research, it appears to me that a corpus-based model of speakers’ linguistic knowledge 
should be able to account for the following characteristic features of language use as 
attested in corpora. First, some linguistic forms are more frequent than others in language
use, and some formally possible forms are unlikely to occur at all. It is reasonable
to assume that speakers know, as it were, about such probabilities of linguistic
forms and their combinations. Second, we find linguistic routines and patterns of different
kinds (see section 1) so that speakers’ linguistic knowledge not only allows for
infinite use but is based on routine as well. Third, quantitative data on the frequencies 
and patterns in text can usually be explained by functional and context-dependent 
principles and factors. These principles and factors then seem to be part of speakers’ 
linguistic knowledge: language encoders are obviously guided by such principles and 
factors to make appropriate use of their linguistic means and to adhere to regular 
expectations in their linguistic behaviour. In my view, this observation should translate
into a model which ascribes to whatever is frequent in language use a status that
is different from whatever is rarely used. Fourth, lexical and grammatical choices are
interdependent in language use. The all-pervading nature of colligations and lexicogrammatical
patterns calls into question the autonomy of syntax. A model of speakers’
linguistic knowledge that is supposed to account for actual usage should take, as
Halliday (1991:31) puts it, ‘lexicogrammar as a unified phenomenon’.

To summarise, a corpus-based model of speakers’ linguistic knowledge would 
be data-oriented and frequency-based, functionalist and lexicogrammatical in 
nature. Basing a description of linguistic knowledge on quantitative data obtained 
from corpora would be a good example of what Kemmer and Barlow (2000:x) call
a ‘usage-based model’: ‘This idea of the fundamental importance of frequency... 
sharply distinguishes usage-based models from other approaches in which frequency 
is an insignificant artifact, unconnected with speakers’ linguistic knowledge’. The 
foundations of a corpus-based (and, thus, usage-based) model of speakers’ linguistic 
knowledge can also be easily reconciled with Langacker’s (1987, 1999, 2000) cognitive
grammar approach. In fact, the term ‘usage-based model’ was first used by Langacker
(1987:494) and defined as follows: ‘Substantial importance is given to the actual use
of the linguistic system and a speaker’s knowledge of this use; the grammar is held
responsible for a speaker’s knowledge of the full range of linguistic conventions,
regardless of whether these conventions can be subsumed under more general statements.
[It is a] nonreductive approach to linguistic structure that employs fully articulated
schematic networks and emphasizes the importance of low-level schemas’. In a
similar vein to Halliday (see above), Langacker regards system and use as inseparable
and speakers’ linguistic knowledge as a knowledge based on, and derived from, the
actual use of the system. It is obvious that corpus evidence can play a major role
in such a usage-based cognitive grammar: corpora are samples of ‘actual use of the
linguistic system’; the ‘schematic networks’, ‘low-level schemas’ and ‘linguistic conventions’
correspond largely to the lexicogrammatical patterns and routines that can be
identified by drawing on corpus data.

In the following section, I would like to delve more closely into some of the lexicogrammatical
patterns of the verb provide in order to illustrate the way in which a corpus-based
model of speakers’ linguistic knowledge may help bridge the gap between
the analysis of actual corpus data and the modelling of language cognition. In particular,
I will try to show that the cognitive grammar approach may profit considerably
from the consideration of corpus data, thus putting into operation Schmid’s (2000:39) 
‘From-Corpus-to-Cognition Principle: Frequency in text instantiates entrenchment 
in the cognitive system’. 

3. from corpus to cognition: a sample analysis. The verb provide is associated 
with an argument structure the semantics of which, according to Goldberg (1995:49), 
can be described as ‘X causes Y to receive Z’. I would like to refer to the three corresponding
semantic roles as the acting entity (X), the affected entity (Y) and the provided
entity (Z) respectively.³ In this argument structure, the verb provide has four
formally different patterns:⁴

• the ditransitive pattern: V - n₁ - n₂,

• the with-pattern: V - n₁ - with - n₂,

• the for-pattern: V - n₂ - for - n₁,

• the to-pattern: V - n₂ - to - n₁.
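
To make the formal difference between the three prepositional patterns concrete, the following sketch labels a clause by the first of with, for or to that follows a form of provide. It is a deliberately naive heuristic introduced here for illustration only, not the procedure used in this study: the function name, the regular expression and the short test phrases are all assumptions. Because it looks at word strings rather than syntactic structure, it cannot tell an adverbial prepositional phrase from one that postmodifies the object noun, which is why a parser is needed for the actual analysis.

```python
import re

# Hypothetical helper: matches the word-forms provide, provides, provided, providing.
PROVIDE = re.compile(r"\bprovid(?:e|es|ed|ing)\b", re.IGNORECASE)
PREPOSITIONS = {"with": "with-pattern", "for": "for-pattern", "to": "to-pattern"}


def label_pattern(clause: str):
    """Label a clause by the first relevant preposition after a form of 'provide'.

    Returns None if the clause contains no form of 'provide'.
    """
    match = PROVIDE.search(clause)
    if match is None:
        return None
    for word in re.findall(r"[a-z]+", clause[match.end():].lower()):
        if word in PREPOSITIONS:
            return PREPOSITIONS[word]
    # Could be the ditransitive pattern or a plain 'V n' use; the heuristic cannot say.
    return "no preposition found"


for phrase in (
    "provide someone with something",
    "provide something for someone",
    "provide something to someone",
    "provide someone something",
):
    print(f"{phrase!r}: {label_pattern(phrase)}")
```

A string-based labeller of this kind is useful at most for a first rough pass over concordance lines; every hit still has to be checked against a syntactic analysis.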

It should not go unmentioned that some colleagues (especially American native
speakers) have raised objections to my treatment of the to-pattern as a lexicogrammatical
pattern in its own right. However, as I have argued elsewhere (cf. Mukherjee:
in press), the to-pattern is a valid pattern which is structurally and semantically analogous
to the for-pattern. Consider the following two examples, displaying the use of
the two patterns in very similar contexts:⁵

(1) ...shall provide technical assistance and funds to States for training for
public safety officials ... (FROWN H15 22-23)

(2) In carrying out the requirements to provide technical assistance and funds
for training,... (FROWN H15 151-152)

In both cases, a computerised parsing on the basis of the TOSCA (Tools for Syntactic
Corpus Analysis) scheme (cf. van Halteren and Oostdijk 1993:145-62) would result
in a direct object realised as a noun phrase and a subsequent adverbial realised as a 
prepositional phrase (introduced by to or for). Of course, there are some instances

pattern of provide   V-n₁-n₂   V-n₁-with-n₂   V-n₂-for-n₁   V-n₂-to-n₁   Σ tokens of provide

LOB                   0.0 %       6.0 %         15.1 %         3.0 %              398
FLOB                  0.0 %       5.9 %         15.0 %         5.7 %              540
BROWN                 0.6 %       6.9 %         16.9 %         4.7 %              508
FROWN                 0.7 %       5.9 %         15.9 %         7.5 %              577
BNC                   0.0 %       6.1 %         14.8 %         3.8 %           22,312

Table 1. Relative frequencies of four patterns of provide in five corpora.


of provide which superficially resemble the to-pattern but in which the prepositional
phrase is a postmodification within the noun phrase which, in its entirety, functions
as a direct object:

(3) It provides an excellent guide to inter-agency co-operation... (FLOB H09 110)

Such instances of provide, which the TOSCA parser would analyse differently, have
not been taken into account. Interestingly enough, some British native speakers have
objected to the inclusion of the ditransitive pattern (but not of the to-pattern). This
does not come as too much of a surprise since it has often been suggested that the
ditransitive use of provide is restricted to American English (cf. e.g., Quirk et al.
1985:1210). In a wider setting, both objections reveal the need for as little human intervention
as possible in the collection and analysis of data (e.g., by means of automatic
parsing) due to the subjectivity and unreliability of intuition. 

Four 1-million-word corpora have been searched for the four patterns under
discussion: the Lancaster-Oslo/Bergen Corpus (LOB) of written British English with
texts from 1961, the Freiburg LOB Corpus (FLOB) with texts from 1991/92, the Brown
Corpus (BROWN) of written American English from 1961 and the Freiburg Brown Corpus
(FROWN) with texts from 1991/92. Also, the 100-million-word British National
Corpus (BNC) of spoken and written British English from the 1990s has been taken
into consideration. Table 1 gives the relative frequencies of the four patterns in relation
to the total number of occurrences of provide in each corpus.
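
Since Table 1 reports percentages together with the total number of tokens of provide, approximate absolute frequencies can be recovered by simple multiplication. The sketch below does this; the dictionary layout and the function name are hypothetical, and the resulting counts are estimates only, because the published percentages are rounded to one decimal place (for the BNC this leaves an uncertainty of roughly ten tokens per cell).

```python
# Values as read from Table 1: percentages for the ditransitive, with-, for-
# and to-pattern, followed by the total number of tokens of 'provide'.
TABLE_1 = {
    "LOB":   ((0.0, 6.0, 15.1, 3.0), 398),
    "FLOB":  ((0.0, 5.9, 15.0, 5.7), 540),
    "BROWN": ((0.6, 6.9, 16.9, 4.7), 508),
    "FROWN": ((0.7, 5.9, 15.9, 7.5), 577),
    "BNC":   ((0.0, 6.1, 14.8, 3.8), 22_312),
}
PATTERNS = ("ditransitive", "with", "for", "to")


def approximate_counts(percentages, total):
    """Estimate absolute frequencies from rounded percentages and a token total."""
    return tuple(round(p / 100 * total) for p in percentages)


for corpus, (percentages, total) in TABLE_1.items():
    counts = dict(zip(PATTERNS, approximate_counts(percentages, total)))
    print(f"{corpus:<6} {counts}")
```

The four patterns together account for only about a quarter of all tokens of provide in each corpus; the remainder belongs to other patterns.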

When the chi-square test is applied, the relative frequencies of the patterns in the five
corpora turn out to be very stable, the differences being statistically insignificant. In
particular, there is no significant diachronic change (nor, in fact, any regional variation),
which is at odds with the following hypothesis put forward by Hunston and Francis
(2000:97): ‘Although provide is typically used with the pattern V n with n (“provide
someone with something”), there are a handful of occurrences in the Bank of English
of “provide something to someone” (the pattern V n to n), presumably by analogy with
give’. However, it is possible to account for the stable distribution of the patterns at
hand from an entirely synchronic and functional point of view. As the ditransitive use
of provide (i.e., without preposition) occurs sporadically in the two American English
corpora only, I would like to concentrate on the remaining patterns of provide,
namely the with-pattern, the for-pattern and the to-pattern.

Figure 1. A constructional and lexical network of provide.

Drawing on Langacker’s (1999) constructional and lexical networks, the lexicogrammar
of provide can be visualised somewhat simplistically as in Figure 1, which
shows that the three constructions under discussion are associated with specific
verbs. Also, the arrows indicate that not only do specific verbs choose specific constructions,
but that specific constructions select specific verbs in return. Needless to
say, provide occurs in other patterns as well, as in the pattern ‘V n’ (where provide is
followed by a noun group only) and the pattern ‘V that’ (where provide is followed by
a that-clause). Those patterns, however, are left out of consideration for the purpose
of this paper. With regard to the occurrence of one and the same lexeme in different
constructions, Langacker (2000:35) notes that ‘elements are always shaped by the contexts
in which they occur’. However, I will not get into details about the interesting fact
that all the patterns of provide choose a restricted lexis, that is, that the constructions
may be regarded as carrying an abstract meaning themselves. This important aspect
has spawned a vast literature over the last years, particularly in construction grammar 
with its focus on argument structure (cf. Goldberg 1995 passim). 

The point I am making is that the plausibility and scope of any cognitive framework 
could be increased considerably if information obtained from corpora were included. 
The following examples are intended to illustrate the different kinds of information that 
could and should, in my view, be included. Specifically, this corpus-based information 
makes it possible to describe not only which patterns are available, but also which prin¬ 
ciples and factors are at work in the procedure of pattern selection. 

Consider examples (4) to (6): 

(4) ...your study will provide you with the knowledge that is generally
accepted... (BNC EEB 179)

(5) ...providing teachers with scientific resources and project materials...
(FLOB H12 216)

(6) ...which is to provide the nation with food of the highest quality...
(LOB H10 65)

In more than 90% of all occurrences of the with-pattern, the affected entity is animate,
as is the case in examples (4) and (5). In example (6), the nation can also be
regarded as animate since collective nouns are usually subsumed into animate gender
classes as well (cf. Quirk et al. 1985:316). This strong association between the
with-pattern and animate affected entities holds true for all five corpora at hand. No such
association can be observed in the for-pattern.

What is more, the with-pattern and the for-pattern differ with regard to the order
of elements. In the with-pattern, the affected entity precedes the provided entity. In
the for-pattern this order is reversed. Speakers thus choose one of the two patterns
according to the principles of end-weight and/or end-focus. That is to say, if the
affected entity represents the heavier constituent, the for-pattern is used, as in
examples (7) to (9):

(7) Should the government directly provide education for the children who
want public education? (Brown J48 1950)

(8) The special Hospitals Broadmoor, Rampton and Ashworth Hospitals
provide care for 1,700 mentally abnormal people who are judged to...
(BNC FYW 1096)

(9) The lute also provided the music for the game of musical chairs they
played... (BNC G3M 963)

If, on the other hand, the provided entity is the heavier constituent, it is the with- 
pattern that tends to be used. Examples (10) to (12) illustrate this correlation: 

(10) It also conveniently provided me with straight edged divisions of the
remaining space. (BNC CN4 417)

(11) ...the New Age provides seekers with a spiritual core around which they
can orbit... (FLOB D15 20)

(12) ...and provide the Americans with bases from which nuclear weapons can
be used. (LOB B23 19)
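
The end-weight correlation illustrated by examples (7) to (12) can be sketched as a toy selection heuristic. The sketch below is only an illustration under stated assumptions: the function names choose_pattern and realise are hypothetical, constituent weight is approximated crudely by the number of words, and ties fall back on the for-pattern. None of this is part of the corpus analysis itself, which describes a tendency rather than a rule.

```python
def choose_pattern(affected: str, provided: str) -> str:
    """Pick the pattern that places the heavier constituent last.

    Assumption: weight is approximated by word count, which is a crude
    stand-in for the notion of end-weight used in the text.
    """
    affected_weight = len(affected.split())
    provided_weight = len(provided.split())
    if provided_weight > affected_weight:
        # with-pattern: affected entity first, heavier provided entity last
        return "with"
    # for-pattern: provided entity first, heavier affected entity last
    # (also used here as the fallback when both weights are equal)
    return "for"


def realise(verb: str, affected: str, provided: str) -> str:
    """Linearise the verb and its two entities according to the chosen pattern."""
    if choose_pattern(affected, provided) == "with":
        return f"{verb} {affected} with {provided}"
    return f"{verb} {provided} for {affected}"


# Constituents taken from examples (7) and (10)
print(choose_pattern("the children who want public education", "education"))
print(choose_pattern("me", "straight edged divisions of the remaining space"))
```

A fuller model would also have to weigh in end-focus, animacy and the lexical restrictions on the to-pattern, which a word count cannot capture.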

In a similar vein, speakers may choose between the two patterns in order to place
the rhematic information in end-focus position.⁶ In examples (13) and (14), the
for-pattern is used because the affected entity is in focus.

(13) A white cow used to provide milk for everyone in the locality...
(BNC BMT 420)

(14) To solve the elder-care problem, he would provide “choices” for old people
who still have a lot of money. (Frown A14 158-59)

In examples (15) and (16), the with-pattern is used because it is the provided entity
that is to be focused.

(15) ...the EC must be strengthened to provide the world with a counterweight
to the USA. (FLOB F17 168)

(16) ...to compensate drivers for any apparent risks in trucking. In addition,
it is quite possible that firms provided the drivers with greater safety
resources... (Frown J41 104-106)

The third pattern of provide to be mentioned is the to-pattern. This pattern
displays the same order of elements as the for-pattern. However, as can be seen in Table 1,
the to-pattern is used significantly less frequently than the for-pattern. This is the case
because the to-pattern tends to co-occur with a restricted range of provided entities.
Consider examples (17) to (19):

(17) ...thus providing a more effective challenge to independent services.
(FLOB G76 196)

(18) ...it provides the only realistic solution to the problems of race relations...
(LOB D17 84)

(19) ...governments are able to provide local subsidy to local firms or
individuals... (Frown H05 153)

The corpus analysis shows that the to-pattern occurs with nouns (as provided
entities) which are usually followed by the preposition to. That is to say, these nouns
(such as challenge, solution and subsidy) have a pattern themselves which could be
described as ‘N to n’. The list of some nouns with this pattern given in Table 2 is based
on the pattern information indicated in the Collins COBUILD English Dictionary.

To sum up, the analysis of large amounts of data makes it possible to explain in 
functional terms the choice of one of the three patterns of provide under discussion. 
In the light of the corpus data, the following principles and factors turn out to be 
relevant: the principles of end-weight and end-focus in general and lexicosemantic 
restrictions on the with-pattern and the to-pattern in particular. Note also that corpus
evidence may help explain why the for-pattern is more frequent than the other
patterns in all corpora: it is restricted neither in terms of the animacy of the affected
entity nor in terms of the noun in n2-position and the preposition that usually follows
it. The for-pattern is therefore very flexible and can be used with virtually all affected
and provided entities. It is, as it were, the default case of pattern selection. Generally 
speaking, the aforementioned principles and factors derived from corpus evidence 
appear to be part of speakers’ linguistic knowledge since they lead language users to 
prefer a specific pattern to others in given contexts.

pattern of provide: V - n2 - to - n1

examples of nouns occurring in n2-position:
aid, assistance, answer, boost, care, challenge,
contribution, grant, help, impetus, incentive,
information, input, protection, sanctuary,
service, solace, solution, stimulus, subsidy,
support, treatment, value

nouns in n2-position generally occur in the pattern ‘N to n’

Table 2. The restricted lexis in the n2-position of the to-pattern of provide.



Figure 2. Including corpus evidence in the constructional and lexical network of 
provide. 

Thus, corpus evidence should be included in the constructional and lexical net¬ 
work given in Figure 1. Figure 2 offers a refined version of Figure 1 in that it includes 
the preference for the for-pattern in general and the principles and factors which are
relevant to the actual use and choice of each pattern in particular. The graphic visu¬ 
alisation approximates to a usage-based model in which the framework of cognitive 
grammar is complemented with corpus data. 

4. some concluding remarks. In the present paper, I hope to have shown that a 
cognitive grammar of speakers’ linguistic knowledge can be fruitfully combined with 
corpus evidence, resulting in a model that is much more usage-based and, thus, much 
more plausible. In the from-corpus-to-cognition approach as described and exempli¬ 
fied above, the scope of corpus evidence is about to widen. In my view, corpus data 
not only tell us important things about language use, but also about the underlying 
language system as represented in speakers’ linguistic knowledge. 

Secondly, it goes without saying that no corpus—however large it may be—will ever 
cover all the structural possibilities in lexicogrammar. It is here that intuitive data such 
as grammaticality and acceptability judgements of native speakers will continue to play 
a major role in any usage-based model of speakers’ linguistic knowledge. As J. Aarts
(1991:46-52) and others have noted, the observation of corpus data on the one hand
and intuition-based methods on the other are not mutually exclusive: corpora show us
probabilities in a language whereas intuition may tell us more about what is possible in
a language. The most comprehensive picture of language, covering linguistic routine as
well as creativity, might indeed be achieved by using both corpora and intuitive data.
As things stand, any plausible model of speakers’ linguistic knowledge should address 
both questions involved: (1) what is structurally possible; (2) what is likely to occur and 
why? Future research into the linguistic system and its cognitive basis should thus take 
into account corpus data to a much larger extent than in the past. 


1 In the present paper, the term ‘corpus’ is exclusively used for ‘a collection of texts assumed 
to be representative of a given language, dialect, or other subset of a language, to be used for 
linguistic analysis’ (Francis 1982:7). In my view, one should abstain from regarding any col¬ 
lection of texts (let alone decontextualised examples) as a corpus (as has recently come into 
vogue). The linguistic corpus is both large and, as a statistically reliable sample, representa¬ 
tive of more than it actually is. In fact, it is this representativeness in corpus design which 
makes it possible to extrapolate general trends in language use from corpus findings. 

2 It goes without saying that other kinds of patterns of usage have been described as well.
In particular, Biber’s (1988) corpus-based description of different linguistic preferences in
different genres and his concept of linguistically defined text-types, cutting across tradi¬ 
tionally established and intuitively defined genres, should not go unmentioned. 

3 In the present paper, I will discuss the corpus data only summarily since my focus is on
the scope of corpus evidence from a theoretical point of view. Both a detailed statistical 
analysis and an elaborated functional interpretation of the data are available elsewhere (cf. 
Mukherjee: in press). 

4 In the patterns, ‘V’ stands for the verb, ‘n1’ for the affected entity and ‘n2’ for the provided
entity. Note that I am not concerned with the passive equivalents of the patterns because
the optionality of the by-agent (and, in fact, its frequent omission) requires a much more
detailed discussion of the relevant principles and genre-specific preferences, which is
beyond the scope of this paper.

5 As a matter of fact, example (2) is a good candidate for multiple analysis. Specifically, one
could argue that training in example (2) is not n1, but a modification within n2 (assistance
and funds for training). The same objection would hold true for solution to the problems
in example (18). This view, however, is clearly based on a particular (but, I think, not the
only possible) analysis of the hierarchical relations between the constituents at hand and
could thus be visualised by way of, say, bracketing (e.g. by representing the structure of
example (2) as V - [n2 - [for - n1]]). Conversely, my analysis in terms of patterns has to be
seen within the framework of the pattern grammar approach, which is intended to make
do without any hierarchical information, i.e. without a genuinely structural analysis (cf.
Hunston and Francis 2000:152). The chief rationale behind this solely pattern-based
analysis, mainly inspired by Brazil’s (1995) Grammar of Speech, is the view that a structural
analysis is only possible once the sentence as a product is finished, whereas the pattern
grammar is an account of speakers’ online production of sentences as a process of
‘prospections’. Thus, it does not come as too much of a surprise that clauses which might be
analysed differently when it comes to hierarchical constituent structure, as in examples (1)
and (2), are subsumed into the same pattern by advocates of pattern grammar, to which I
also subscribe. I would like to thank Ruth Brend for a discussion of this theoretical issue in
general and for pointing out to me some potential problems involved in a purely
pattern-based analysis in particular.

6 While the heaviness of a constituent is a clause-internal (and, thus, easily accessible) feature,
it is quite clear that the givenness and newness of constituents can only be described
by referring to the context. It is here, that is in exploring ‘language as function in context’ 
(Tognini-Bonelli 2001:4), that the analysis of corpora has a great advantage over the use of 
decontextualised (and/or invented) sentences and artificial laboratory data.
MEASURING UP TO EXPECTATIONS: WHAT CONSTITUTES
EVIDENCE IN CHILD LANGUAGE RESEARCH 


Suzanne Quay 

International Christian University, Tokyo 


Child language research has come a long way since the early twentieth century, 
when researchers interested in child language development conducted longitudinal 
case studies using diary records (for example, the classic studies by Ronjat 1913 
and Leopold 1939). When Bennett-Kastor (1985/86) examined 154 studies published 
between 1970 and 1985 in the area of child language development, she discovered 
a division in methodological practices between linguists and those trained in the 
humanities, and psychologists and those trained in the social sciences. In the studies 
she examined, linguists had a greater preference for naturalistic data based on a small 
number of subjects and moreover, tended to investigate production. Psychologists 
had a greater preference for experimental data involving large numbers of subjects 
and were more likely to investigate comprehension or response. Linguists were found
to use audiotaping more often than psychologists did, with videotaping, diary records
and other stimulus materials used less often than expected. Both groups favored
studying children aged between three and five, with linguists showing a slight shift
towards using subjects aged between birth and two years, and psychologists showing
a slight shift towards using subjects over the age of ten.

Child language research was thus described by Bennett-Kastor (1985/86) as a
disunified field, more so at the level of data and method than in theory, because of the
different backgrounds of its practitioners. Nevertheless, the field has changed with
joint efforts from linguists and psychologists. Work on the Child Language Data
Exchange System or CHILDES (MacWhinney 1995), particularly in the last decade, has
provided methodological guidance and tools for child language research, with a
database of transcripts for researchers worldwide and programs for the computer analysis
of transcripts. Advances in technology have also played a role in changing child
language research. Digital recordings and wireless transmissions can now be used, a far
cry from note-taking done by linguists and/or parents while listening to children. The
question of interest here is how effective past and present methods are in measuring
early child language development.

1. a study with several methods of data collection. This paper presents separate
analyses of data collected using different methods from a longitudinal case study¹ that,
considered separately, would create a misleading picture of the child’s actual linguistic
behavior but which are more informative when considered together. The results
indicate, as we will see, that the use of composite, rather than individual, measurement
formats is a desideratum in any child language acquisition study. The child in question,
a male infant aged between 0;11 and 1;10 (year;month.day), was too young to be studied
using the testing and experimental methods favored by psychologists. Thus only naturalistic
observation methods were used, and their effectiveness is discussed as we seek to answer
whether each method and its results measure up to our expectations.

Method: Questionnaires
Frequency of collection: completed by parent(s) at ages 1;0, 1;5 and 1;9
Information gathered: family background; amount of exposure to each language
Summary of results in percentages: amount of language input from 0;11 to 1;9
(average % of time heard): English 46%, Japanese 36%, German 18%

Method: Interviews
Frequency of collection: conducted at the home and in the daycare at ages 1;0, 1;5 and 1;9
Information gathered: child’s general behavior; understanding and use of particular
words and gestures

Method: Diaries
Frequency of collection: mother: each month for words understood and every 2 weeks for
words produced; daycare staff: every day Freddy was in attendance
Information gathered: new words understood and produced by child in the home
and at the daycare
Summary of results in percentages: comprehension up to 1;4:
English 63%, German 35%, Japanese 2%

Method: MCDI
Frequency of collection: Infant & Toddler Short Forms in English & German: every 3-4
weeks; full Japanese MCDI at ages 1;0, 1;5, and 1;9
Information gathered: common/typical words understood and produced by English-,
German- and Japanese-speaking children
Summary of results in percentages: production up to 1;10:
English 57%, Japanese 34%, German 9%

Method: Video recordings
Frequency of collection: each week in the home: once in an English context and once in a
German one; each week in the Japanese daycare
Information gathered: actual linguistic production in three separate language contexts
Summary of results in percentages: language choice in lexical production, conflated
across the two language contexts: Japanese 78%, German 14%, English 8%

Table 1. Freddy’s language development according to each method used.














2. METHOD.

2.1. the trilingual family. The subject of this study, Freddy, was born on April 24th, 
1997 in Tokyo, Japan to a German father and an American mother. From birth, Freddy 
heard German from his father, a landscape architect with a Masters in Engineering, 
and English from his mother, a university professor with a doctorate in Sociology. 
Freddy’s parents spoke primarily German to each other and to the child when all 
three were alone together. Both parents were also competent in speaking Japa¬ 
nese, the main language of the local community. As appropriate, they would speak 
Japanese with Japanese interlocutors and English with English ones. 

2.2. the daycare environment. Freddy was the only non-Japanese child at the daycare
that he attended from age 0;11 onwards for six hours each weekday. The eight
other children and all the daycare staff were monolingual Japanese speakers. Thus
Freddy was exposed to Japanese mainly at the daycare and to English and German
mainly in the home.

2.3. data collection. Data were collected in the home and at the daycare through
questionnaires, interviews, diary records, MacArthur Development Inventories
(hereafter, MCDI), and video recordings, as listed in the first column of Table 1. The
frequency of data collection and the information gathered from each method are
summarized in the second and third columns respectively of Table 1.

The parental questionnaire was administered when the child was aged 1;0, 1;5
and 1;9 to gather information about the social background of the family as already
described and the changing language exposure patterns for the child. Additional
questions about the child’s general behavior, understanding and use of particular
words and gestures, knowledge of specific games, preferences for certain toys, and
whether imaginative play occurred were asked in the interviews conducted with the
parent(s) and daycare staff, also at ages 1;0, 1;5 and 1;9.

The mother and daycare staff were asked to keep a diary of words Freddy understood
as well as words produced. The mother’s diary noted her son’s production of
lexical items in English, German and Japanese from ages 0;10.25 to 1;10.1. For the
words her child understood, she listed them in her diary roughly once a month.
The words he produced were listed on average every two weeks. Members of the
daycare staff noted Freddy’s linguistic progress in a daily diary along with their usual
entries about the child’s behavior and activities, which they provided daily for every
child to inform parents about their child’s day.

The MCDI, a type of parental report, was also used to assess Freddy’s understanding
or understanding with production of common vocabulary items. Every three to four
weeks, the mother completed the English version of the MCDI: the infant short form
from ages 1;0 to 1;4 and the toddler short form from ages 1;4 to 1;9 (details about the
short form versions of the MCDI are in Fenson, Pethick and Cox 1994). The mother,
without being asked, also indicated on the form when her son understood English
words but produced the words understood in English in either German or Japanese
(the MCDI was of course designed to assess monolingual children and does not take
into account the language variations that may occur in terms of understanding with
production in multilingual children). Because a German version of the MCDI was not
available, the German-speaking father was asked to consider, also every three to four
weeks, whether the items on the English version of the parental report were applicable
in a German language context (basically, to translate the English items into German and
report on whether the child understood or produced these words in German). The
Japanese version was a full MCDI form adapted into Japanese by Ogura and her colleagues
at Kobe University (Ogura 1998; Ogura, Yamashita and Murase 1998). The Japanese
MCDI was used to interview various members of the daycare staff (more than one staff
member and different volunteer caregivers were present each day) when Freddy was
aged 1;0, 1;5 and 1;9 to obtain an idea of Freddy’s receptive and expressive vocabulary in
that language environment. It was not possible to ask any one daycare staff member to
complete the checklist on a more regular basis than during the interviews.

Video recordings were made every week (when possible) in the home, once with 
the mother addressing the child in English (considered to be an English language 
context situation) and once with the father addressing the child in German (consid¬ 
ered to be a German language context situation). Video recordings were also made 
each week in the daycare, a Japanese language environment. A Sony Handycam Hi8 
Video Camera Recorder (CCD-TRV85 NTSC) was used in all three separate language
contexts. One video camera was left with the parents to use on a tripod, as this cre¬ 
ated an environment whereby the child could in fact draw specifically on his knowl¬ 
edge from one or the other of his two home languages in interactions with either 
his English-speaking mother or with his German-speaking father without interfer¬ 
ence from a camera operator who could possibly affect the child’s language choice. A 
second video camera was used by the Japanese-speaking research assistant who was 
sent into the daycare each week to videotape Freddy playing at the daycare or at dif¬ 
ferent parks near the daycare. Altogether, thirty sessions were recorded in the Japa¬ 
nese daycare, twenty-nine sessions with the mother speaking English and eighteen 
sessions with the father speaking German. 

2.4. transcription and coding. Full transcripts in the CHAT format of CHILDES
(MacWhinney 1995; see also Sokolov and Snow 1993) are still being made for thirty
minutes of video-recorded sessions in the German, English, and Japanese language
contexts (please note that all reference to contexts is based only on the main
language used by the child’s interlocutors and does not imply that the infant is aware
of having the three languages). Due to the on-going nature of this time-consuming
task, this paper reports on a small part of the data, specifically on six pairs of video
recordings made in the home. His mother addressed him in English in six English
context sessions and his father addressed him in German in six German context
sessions, spaced at approximately six- to eight-week intervals between ages 1;1 and 1;10.
Further details about these recordings can be found in Quay (2001).²

In order to investigate the child’s language choice patterns, it was necessary to
code his utterances as English, German, or Japanese. The identification of Freddy’s
early utterances was not an easy task. The phonetic transcriptions of the child’s
utterances in PHONASCII (MacWhinney 1995: 82-85), along with information on the child’s
actions and the adult’s responses in the transcripts, aided in the coding of utterances.
In most cases, coding relied on the phonetic resemblance of the utterance to an adult
source word in English, German or Japanese, but phonetically simple versions of
adult source words were also accepted. For utterances that clearly resembled adult
source words, the child’s reference or use also had to be similar to or match the adult’s or
the child’s own use in other situations. Coding was made easier when previous
occurrences of the particular utterance had already been noted in other sources (diaries,
MCDIs, interviews), thus providing further evidence of the child’s consistent use of a
word for a particular meaning. Any utterances that were ambiguous between any two
or all three languages were excluded from further analyses.
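
The exclusion rule for ambiguous utterances can be made explicit with a small sketch. It assumes, purely for illustration, that each transcribed utterance has already been paired with the set of languages whose adult source words it resembles; the helper names code_utterance and tally are hypothetical, and the item "mama" is an invented example of an ambiguous form, not a token from the recordings.

```python
from collections import Counter


def code_utterance(candidate_languages):
    """Return the language code if exactly one language matches, else None.

    None means the utterance is left out of further analysis, either because
    it is ambiguous between languages or because no source word was found.
    """
    if len(candidate_languages) == 1:
        return next(iter(candidate_languages))
    return None


def tally(utterances):
    """Count language-identifiable tokens, skipping excluded utterances."""
    codes = (code_utterance(languages) for _, languages in utterances)
    return Counter(code for code in codes if code is not None)


# Illustrative input; the candidate sets stand in for the coder's judgment
sample = [
    ("ne ne", {"Japanese"}),
    ("down", {"English"}),
    ("mama", {"English", "German", "Japanese"}),  # hypothetical ambiguous item
]
print(tally(sample))
```

The judgment of phonetic resemblance and matching reference is of course the hard part and stays with the human coder; the sketch only shows how ambiguous items drop out of the token counts.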

3. results. The results reported here have been simplified into percentages, as 
shown in the last column of Table 1, for the main purpose of comparing the results 
with the methods used. More detailed results are presented in figures and charts in 
Quay (2001). 

3.1. LANGUAGE EXPOSURE AS EXTRAPOLATED FROM THE QUESTIONNAIRE. The
questionnaire filled out by Freddy’s mother provides the information about his language
exposure. Freddy spent his first twenty-two months of life in Tokyo, with a two-week
visit to his relatives in Germany when he was only two and a half months old. During
a six-week period in the summer of his second year (between ages 1;2.24 and 1;4.4),
the family spent four weeks in the United States and two weeks in Germany.

The ‘questionnaire’ row of the last column of Table 1 shows that on average
between ages 0;11 and 1;9, Freddy heard English 46% of the time, Japanese 36% of the
time and German only 18% of the time. Freddy’s exposure to German is the lowest
due to the fact that his father had a busy work schedule and was absent for part of the
time when Freddy was aged between 1;5 and 1;9.

3.2. RESULTS FROM PARENTAL AND DAYCARE REPORTS IN INTERVIEWS, DIARIES AND
MCDI. The data from the MCDI, the diaries and the interviews were combined (as
indicated by the shading for these three data sources on Table 1) to determine the
composition of Freddy’s early lexicon in terms of comprehension up to age 1;4 and
production up to age 1;9 in three languages.

In terms of vocabulary comprehension, Freddy appears to understand more English
words than German or Japanese ones from ages 1;0 to 1;4. By age 1;4, 63% (N=52) of
the words Freddy could understand were English, 35% (N=29) were German words and
only 2% (N=2) were Japanese ones (cf. the ‘diaries’ row of the last column of Table 1).

In terms of words Freddy could produce up to age 1;9 (based again on the MCDI,
diaries and interviews), 57% (N=50) of them were English words, 34% (N=30) were
Japanese words and 9% (N=8) were German words (cf. the ‘MCDI’ row of the last
column of Table 1). At age 1;9, he had a total of 88 different vocabulary types in his
lexicon from three language sources.

3.3. results from video recordings. The most unexpected results occurred in the
video recordings, when Freddy produced mainly Japanese utterances in both the
English- and the German-language contexts. Based on the data obtained from the MCDI,
diaries and interviews, we would have expected Freddy to produce more English
words than German or Japanese ones. Contrary to our expectations, in both language
contexts, Freddy consistently produced more Japanese tokens.

The data are conflated across the two language contexts, as shown in the ‘video
recordings’ row of the last column of Table 1. Of all the data that could be identified
as having a language-specific source, 78% (N=209) are Japanese utterances (in terms
of tokens). German was produced slightly more often than English: German utterances
made up 14% (N=37) of his total language-identifiable production. Only 8% (N=21)
are English utterances, in spite of the fact that the MCDI and diaries report that Freddy
produced the most English utterances by age 1;9.
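
The percentages reported in sections 3.2 and 3.3 follow directly from the counts given in the text, as the short check below shows. It is only an arithmetic sketch: the function name percentages and the variable names are illustrative, and rounding to whole percentages is assumed to match Table 1. Note that the comprehension and production figures count vocabulary types while the video figure counts tokens, so the three distributions are not strictly comparable.

```python
def percentages(counts):
    """Convert raw counts per language into rounded whole percentages."""
    total = sum(counts.values())
    return {language: round(100 * n / total) for language, n in counts.items()}


# Counts as reported in the text
comprehension = {"English": 52, "German": 29, "Japanese": 2}   # types, up to 1;4
production = {"English": 50, "Japanese": 30, "German": 8}      # types, up to 1;9
video_tokens = {"Japanese": 209, "German": 37, "English": 21}  # tokens, home videos

for label, counts in [("comprehension", comprehension),
                      ("production", production),
                      ("video", video_tokens)]:
    print(label, sum(counts.values()), percentages(counts))
```

The reversal is easy to see once the figures are side by side: English leads in the reported lexicon, yet it accounts for the smallest share of the tokens actually recorded on video.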

4. discussion. The results from each individual method tell a different story about
Freddy’s language development when viewed separately. According to parental and
daycare reports in the form of interviews, diaries and MCDI, Freddy is the most
proficient in English for both comprehension and production. As for his other two
languages, the comprehension data suggest he understands more German than Japanese
at least up to age 1;4, but the production data up to age 1;9 show that he demonstrates
more Japanese spoken ability than German. The most surprising results come from
the video recordings, with evidence indicating that Freddy speaks mainly Japanese
even when his mother speaks to him in English and his father speaks to him in
German. Elsewhere in Quay (2001), Freddy’s overwhelming use of Japanese in the
video recordings has been explained as being due in part to having accommodating
trilingual parents, in part to strong peer and community influence, and in part to
personality and sociopsychological factors. Given these results, what can we say are the
limitations and strengths of the methods used?

The questionnaire, interviews, and diaries are different forms of parental reports
that have been criticized with regard to their reliability and objectivity. It is often felt that
parental reports are selective, may not provide enough details to reflect changes in
development or may highlight idiosyncratic forms (cf. Bennett-Kastor 1988: 60-61;
a discussion of disadvantages and limitations of parental reports can also be found
in Berglund 1999). Some of the problems with such parental reports have been dealt
with by the development of the MCDI, which, while also being a parental report, uses
standardized vocabulary checklists so that parents are relying on their recognition
memory rather than their recall memory when they report on children’s present
rather than past behavior. Since the instruments are checklists that focus on children’s
current communicative skills (Bates, Bretherton, and Snyder 1988; Fenson, Pethick,
and Cox 1994), they are felt to increase the validity of parental reports.

Besides the added advantage of instant playback, video recordings are considered
to be more neutral and give information that the MCDI cannot provide on pragmatic
skills, parental communication, and the nature of interaction and non-verbal
communication beyond simple gestures. Video recordings are considered to leave less
room than parental reports for over-generalization, error and bias. Genesee (1989) has
warned that parental information cannot be totally reliable when it concerns parents’
own speech. Research by Goodz (1989) and Kasuya (1998) on bilingual families has
found that parents claiming to use only the ‘one parent-one language’ approach or to use
mainly one language did not model such speech, thus showing a discrepancy between
reported language use and actual production. This, of course, does not preclude parents
as the best sources for estimates about their children’s early exposure patterns and for
descriptions of sociolinguistic background (De Houwer 1995:224-25 and Kasuya 1998:
331 also defend the usefulness of certain types of parental information).

Video recordings, however, also have limitations, as they provide a particular
sampling that may not be typical of language production during the rest of the day,
when no recording is made. The home activities from which useful language samples
could be obtained were of the child playing with his toys or looking at books.
Recordings were not made, for instance, during the daily diaper-changing event, but his
mother reported in the diary that the English word down was produced at age 1;5.15
when Freddy wanted to be lifted down from his changing table. Similarly, up-down
was used at age 1;8.13 in the morning when he wanted his mother to get up and take
him downstairs. Such utterances never appear in the video recordings because they
are not appropriate to the situations or activities being recorded. From the English
version of the MCDI infant short form completed for Freddy at age 1;5.3 by the mother,
we find that he understands the item night night, but actually says ‘ne ne’ (nenne is
the Japanese baby word for ‘sleep’). He understands the items finish and all gone on the
MCDI but actually produces ‘all done’ for both concepts. None of these items appears
on the video recordings analyzed because they are not needed during the play
activities recorded. These examples from the diary and MCDI indicate that parental
reports can compensate for the particular sampling limitations of video recordings,
especially for constructing a lexicon of the child’s vocabulary in three languages.
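
The way the methods complement one another can be made concrete with a small sketch. The Python below is only an illustration, not part of the study's procedure: it pools the items cited above according to the method that attested them and lists those that the video recordings missed. The function name `merge_lexicon` and the dictionary layout are assumptions introduced for the example, and the `video` entry is left empty because none of the cited items surfaced in the analyzed play sessions.

```python
from collections import defaultdict


def merge_lexicon(sources):
    """Map each attested item to the set of methods that recorded it."""
    lexicon = defaultdict(set)
    for method, items in sources.items():
        for item in items:
            lexicon[item].add(method)
    return dict(lexicon)


# Items taken from the examples in the text; the grouping is illustrative.
sources = {
    "video": set(),                  # none of the cited items appear on tape
    "diary": {"down", "up-down"},    # reported by the mother in the diary
    "mcdi": {"ne ne", "all done"},   # productions noted on the checklist
}

lexicon = merge_lexicon(sources)
missed_by_video = sorted(
    item for item, methods in lexicon.items() if "video" not in methods
)
print(missed_by_video)
```

A fuller lexicon would add the items that do occur on tape to the `video` set, so that items attested by several methods can be told apart from those attested by only one.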

5. conclusions. Different methods in combination contribute to the strengths of
this case study. Evidence from parental and daycare reports in interviews, diaries
and the MCDI serves as an important back-up system to video recordings for three
overlapping reasons:

(1) to balance subjective and objective elements, 

(2) to supply more comprehensive detail, and 

(3) to identify multiple contexts. 

In terms of (1), while videotaping is considered to be more objective than parental
reports, a full understanding of the child’s abilities cannot be captured on video, as
video recordings tend to be limited in frequency, duration and longevity. In other
words, taping does not usually occur seven days a week, during the child’s every
waking moment and for indefinite periods. In this study, the mother and father
observed the child at home while daycare staff observed the child in the daycare. The
researcher visited both the home and the daycare settings, and in the latter research
assistants were also present, making notes about the child’s linguistic behavior.
Although observers are felt to be less objective than mechanical recording equipment,
having many observers, as in this study, increases consensual validity. In terms
of (2), the researcher can be hindered by production data, which are limited to
what the subject happens to say during the sampling period and in its situation. It
would be difficult to examine the child’s full production capabilities, for example in
terms of his lexicon, using only video data that record one type of event, such as play
sessions between the child and his parents. In terms of (3), contexts are deemed to be
the elements that either directly or indirectly affect the development of the child’s
language, such as different physical settings, behavioral and linguistic environments,
and interactional variables. Using video recordings alone tends to show only the
particular context or situation filmed and not other segments of the child’s daily life.

Although the results from the video recordings differ from those obtained from the
MCDI, diaries, and interviews, the systematic analysis of data collected through a
combination of methods can allow reliable inferences about the child’s language
knowledge to be made. Ideally, converging evidence across multiple methods would be
the most powerful approach to testing and corroborating hypotheses about what does
and does not occur in the linguistic competence of a child. When converging evidence
is not available across multiple methods, at least new issues can be raised, and
researchers have a more comprehensive database from which to draw inferences and
explain anomalous results. Caution is thus advised in the interpretation of results in
studies that depend on just one method to measure children’s communicative abilities.


I gratefully acknowledge a grant from the Matsushita International Foundation and the 
time, generous cooperation and active participation of Freddy’s parents, the daycare staff 
and Freddy himself on this project. I am grateful to Ayako Inoue, Noriko Tamura and 
Junko Ogawa who took turns video recording Freddy in the daycare and also to the 
following research assistants: Ayako Kuwabara, Kyoko Okamura, Natsumi Sakurai, Yoko 
Sato, Bettina Shimazu, Yuki Takai, and Sayaka Yoshida. 

This paper refers to further details in Quay (2001) but differs from that article in that
the summary and presentation of results in terms of percentages, as shown in Table 1,
have not appeared elsewhere. Unlike Quay (2001), which describes the role of input in
early trilingual development, this paper focuses on the overall results from each method
for the express purpose of showing that individual measures of child language use and
comprehension are misleading unless put in relation to one another.


A PLAUSIBLE CONTRADICTION


William J. Sullivan 

Uniwersytet Wroclawski and University of Florida 


Our conference theme, what constitutes linguistic data, has an implicit codicil: (data)
from which a description may be built. Phonetic data do not suffice for a sociolinguistic
study, and detailed social parameters are irrelevant to the phonetic phenomenon of,
say, voicing. Yet under the proper circumstances, both types of data fit Yngve’s (1996
and elsewhere) repeated insistence that a scientific linguistics focus on things given
in advance. Such objectively measurable properties of communicating human beings,
unlike unsupported native speaker intuition, are the only things we can depend on.
Still, recorded observations must be interpreted and arranged with logical coherence.
Kopernik and Ptolemy depended on essentially the same observations. The difference
in their descriptions, however, results from a difference in a (then non-empirical)
postulate, namely whether the solar system is helio- or geocentric. So while Yngve’s
rejection of grammaticality judgements as data is well-founded, I am equally convinced
that reliance on hard data does not guarantee the same description¹.

A perfect example of two divergent descriptions of hard phonetic data exists in
the case of voice assimilation in obstruent clusters in contemporary standard Russian
(CSR). Two phonetic facts have long been recognized: CSR obstruent clusters are
entirely voiced or entirely unvoiced, and the voice nature of the final obstruent
determines the voice nature of the entire cluster. These two observations are accepted in
all descriptions of CSR obstruent phonology in syllable onset position². The problem
is that two radically different descriptions have emerged. One, represented by
Jakobson, Cherry and Halle 1953 (hereafter JCH), treats voiced obstruents as marked³.
The other, represented by Lamb 1966 and 1977, treats unvoiced obstruents as marked.
The postulates are apparently contradictory. Yet both descriptions seem to account
for the phonetic facts of onset obstruent clusters in CSR. It would be hard to criticize
Yngve if he identified this as the result of introducing a philosophical concept
(markedness) into a scientific enterprise⁴.

Yet I am convinced that something else is going on. In fact, no complete description
of the system that underlies both phonetic and neurological facts exists. Therefore,
I give a representative sample of the phonetic facts and a comparison of the
two descriptions. I explain why each approach is plausible and show the locus of the
contradiction. By providing a description of the logic underlying the phonetic facts of
onset clusters, I show that there is no logical contradiction and that both Jakobson
and Lamb are correct.

/t/ vs. /d/⁵                 [d] only                  [t] only

otobrat’ ‘select, prf’       otbirat’ ‘_, impf’
podobrat’ ‘choose, prf’      podbirat’ ‘_, impf’

otojti ‘withdraw, prf’                                 otxodit’ ‘_, impf’
podojti ‘approach, prf’                                podxodit’ ‘_, impf’

kota ‘cat, AGsg’                                       kot ‘_, Nsg’
koda ‘code, Gsg’                                       kod ‘_, Nsg’

Table 1. Voice with obstruents in CSR.


1. phonetic facts and phonological descriptions. Scientific generalizations
(and hence, descriptions) are inductive, Bloomfield said, and he was right. The
foundation is observed fact, interpreted as general statements on which formalized
descriptions are based⁶. I begin with the forms in Table 1, focusing on the elements
represented by t and d in bold italics.

Now the facts concerning these forms are undisputed: column 1 has [t] and [d],
which realize phonemes /t/ and /d/, respectively. Column 2 has [db] clusters. Column
3 shows [tx] and word-final [t]. I henceforth ignore word-final [t].

Contemporary phonological descriptions all owe a debt to Trubetzkoy 1939.
Trubetzkoy describes the system underlying Table 1 as follows: [t] and [d] in column 1
exhibit a (characteristic) privative contrast for voice; columns 2 and 3 show
archiphonemes of neutralization, with voiced D as the unmarked variant in column 2 and
unvoiced T as the unmarked variant in column 3.

JCH basically agree with Trubetzkoy’s analysis, but they use the feature ±voice
and treat [-vce] as the unmarked variant everywhere. In their description the marked
[+vce] is supplied in column 2 and the unmarked [-vce] appears in all column 3
environments. Halle 1959 follows JCH but eliminates contrast and supplies the [+vce] or
[-vce] in columns 2 and 3 by an alpha-switching rule. Chomskyan phonologists have
followed his approach since then. Sullivan 1974 is a relational network (RN)
description. It parallels JCH 1953 without the [-vce] feature. The symbol ‘Y’ represents
phonemic voice (cf. Table 2A).

Contrary to all of these are Lamb 1966 and 1977. Lamb’s approach is also an RN
one. As such, it parallels Sullivan 1974, except that unvoiced obstruents are treated as
marked. The symbol ‘h’ is used to represent phonemic unvoicing (cf. Table 2B). I use
Sullivan 1974 as the representative of the Jakobson approach herein. Because both
Sullivan and Lamb use RNs, they are directly comparable. The two descriptions are
summarized in Table 2.

t     d      tb    db      tx    dx    1   t      d     tb     db     tx     dx
T     TY     TPY   TYPY    Tx    TYx   2   Dh     D     DhB    DB     Dhyh   Dyh
T     TY     TPY           Tx          3   Dh     D     DB            Dyh
Cl    Cl Y   Cl Cl Y       Cl Sp       4   Cl h   Cl    Cl Cl         Cl Sp h
Ap    Ap     Ap Lb         Ap Do           Ap     Ap    Ap Lb         Ap Do
      Y      Y                         5   h                          h
Cl    Cl     Cl Cl         Cl Sp           Cl     Cl    Cl Cl         Cl Sp
Ap    Ap     Ap Lb         Ap Do           Ap     Ap    Ap Lb         Ap Do

A. Sullivan 1974                           B. Lamb 1966/1977

Table 2. Two descriptions of Russian obstruent clusters.

Line 1 represents the input from the morphology to the phonology. In both
descriptions, there are classes of morphemes with sounds potentially realized as [t],
[d], [b], and [x]. It is up to the phonology to determine when each realization occurs.
In line 2, Table 2A relates d and b to TY and PY (i.e., voiced T and P), respectively.
The t and x are related simply to T and x. Table 2B is exactly parallel, except for one
reversal. The t and x are related to Dh and yh (i.e., UNvoiced D and y), respectively,
and d and b are related simply to D and B. Line 3 represents what the phonotactics
(PT) accepts. In both Table 2A and 2B the obstruents are accepted in their morphemic
order, as this fits the general cluster structure of Russian (cf. Figure 2). Any
cluster-final occurrence of Y (in Table 2A) or of h (in Table 2B) is realized in that
position, but any non-final Y or h is unacceptable and is not realized. In both
descriptions, the PT outputs phonemic features in PT order. Note that from line 3 on,
there is no distinction between tb and db columns or between tx and dx columns in
either Table 2A or Table 2B. This is the biunique level of phonemic contrast rejected
in Halle 1959. The important thing in both RN descriptions is that phonemic (UN)voice
is located physically at the end of the obstruent sequence. In both descriptions,
phonemic (UN)voice is shifted to a position that dominates the entire obstruent
sequence, whether there is one obstruent or more, in the hypophonotactics (HPT), given
in line 5. That is, the relation of Y/h to obstruent phonemes is linear in the PT but
hierarchical in the HPT.
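
The realization just described can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a relational network and not either author's formalism: the function names and the `SURFACE` and `CLASS` tables are introduced here only for the example. Each function marks one feature (Y or h), lets only the cluster-final mark be realized, and spreads the result over the whole cluster.

```python
# Surface symbols by (class, voiced?). A voiced dorsal spirant is omitted
# because x is always cluster-final in the forms considered here.
SURFACE = {
    ("apical stop", False): "t", ("apical stop", True): "d",
    ("labial stop", False): "p", ("labial stop", True): "b",
    ("dorsal spirant", False): "x",
}
CLASS = {"t": "apical stop", "d": "apical stop",
         "p": "labial stop", "b": "labial stop", "x": "dorsal spirant"}


def realize_marked_voice(cluster):
    """Sketch of Table 2A: voice (Y) is the marked feature."""
    marks = ["Y" if seg in ("d", "b") else None for seg in cluster]
    voiced = marks[-1] == "Y"   # only a cluster-final Y is realized
    return "".join(SURFACE[(CLASS[seg], voiced)] for seg in cluster)


def realize_marked_unvoice(cluster):
    """Sketch of Table 2B: unvoice (h) is the marked feature."""
    marks = ["h" if seg in ("t", "p", "x") else None for seg in cluster]
    voiced = marks[-1] != "h"   # only a cluster-final h is realized
    return "".join(SURFACE[(CLASS[seg], voiced)] for seg in cluster)


for cluster in ["t", "d", "tb", "db", "tx", "dx"]:
    a = realize_marked_voice(cluster)
    b = realize_marked_unvoice(cluster)
    print(f"{cluster:>2} -> A: {a:<2}  B: {b:<2}  same: {a == b}")
```

For the six inputs of Table 2 the two functions return the same surface forms, which is the sense in which both descriptions account for the phonetic facts.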

The descriptions in Table 2A and 2B look almost like mirror images. They both
account for the phonetic facts. This makes the choice between them almost arbitrary.
The difference derives from whether voice (Y, [+vce]) or UNvoice (h, [-vce]) is
treated as marked⁷. Before looking at the consequences of this difference, I
consider the logical nature and plausibility of the two descriptions.

2. PLAUSIBILITY, CONTRADICTION, AND THE LOGIC OF MARKEDNESS. Consider first
the logic of markedness. The question of markedness only arises in the context of
an asymmetric choice. A free choice is logically a simple OR relation: [this OR that].
The alternates are commutative and equivalent, i.e., [this OR that] = [that OR this]. In
essence, the nature of equivalence here means that a description with the one does not
differ in effect or relative simplicity from a description with the other.

If this is marked, a different situation arises. Under certain circumstances (C)
we must take this and exclude that. Otherwise we cannot get this, and that appears.

[Figure 1 diagram not reproduced; its node labels are C, this, and that.]

Figure 1. A graphic representation of unmarked (A) and marked (B) choices.

Logically, a marked choice is given by [[C implies [this OR NOT-that]] AND that].
Clearly no commutative equivalence is possible here. A graphic representation of
unmarked choices is given in Figure 1A, of a marked choice in Figure 1B.
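
The asymmetry can be illustrated with a short Python sketch. It follows the prose description above (under condition C the marked alternate appears, otherwise the other one does) rather than transcribing the bracketed formula, and the function names `free_choice` and `marked_choice` are assumptions made for the example.

```python
def free_choice(first, second):
    """Unmarked OR: both alternates are available and their order is irrelevant."""
    return frozenset((first, second))


def marked_choice(condition, marked, default):
    """Marked choice: the marked alternate appears only when the condition
    holds; otherwise the default appears."""
    return marked if condition else default


# A free choice is commutative.
print(free_choice("this", "that") == free_choice("that", "this"))

# A marked choice is not: swapping the alternates changes every outcome.
for c in (True, False):
    print(c, marked_choice(c, "this", "that"), marked_choice(c, "that", "this"))
```

Swapping the arguments of `marked_choice` is the programmatic analogue of reversing which member of the opposition is treated as marked.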

We can only reverse a marked choice by turning the descriptive universe upside
down, that is, in the example of Table 1, by shifting from marked voice to marked
UNvoice. This is at least arbitrary and certainly sounds like a contradiction. Nothing
in the phonetic facts requires such a shift. One possible source of justification for
this shift could be in the relative plausibility of the two descriptions, what Bloomfield
referred to as structural justification or compatibility with established portions of the
description⁸. In fact, both descriptions are plausible, though for different reasons.

There are two major reasons for the plausibility of treating voiced obstruents as
marked. First, every voiced obstruent phoneme has an unvoiced counterpart. But
three unvoiced phonemes, /c/, /č/, and /x/, have no voiced counterparts. Thus, stating
the relations of morphemes to phonemic features is marginally simpler if only voice
(and not its lack) must be specified. Second, in the position of absolute neutralization
(phrase-final) only unvoiced obstruents are found⁹.

There are also two major reasons for the plausibility of a description with marked
unvoiced obstruents. First, the normal mode of speech requires voicing. Thus the
unvoiced sounds are in a distinct minority overall. Second, the motor cortex must send
consecutive, overlapping signals to the lungs and vocal folds to produce the voicing.
The ‘normal mode’ mentioned above means that during speech, these signals continue
automatically unless they are cut off. This suggests a relationship from the linguistic
system that signals the end of voicing at the appropriate time. Equally well, it could be
a relationship that prevents an automatic signal from being sent. Either description
fits the logic of the situation. Both favor marked unvoiced obstruents.

Thus both descriptions are clearly plausible, but the plausibility arguments are just as
clearly skew. I see no way to choose the better set on an empirical basis. Moreover, the
second argument favoring marked unvoiced obstruents is not even phonetic. It is an
inference drawn from neurological evidence¹⁰. But this is the key to the whole situation.

3. the complete description. The complete description is given in Figure 2. The
relations from morphology are on the upper left, the PT next to them, and the
hypophonotactics (HPT) on the right. Beginning at the upper left, note the lines labeled
t, d, b, p, and x. They relate to classes of morphemes and are part of an RN description
of spell-out rules for morphemes. Thus, for example, t and d are related ‘upward’ to
the classes of morphemes that are related ‘downward’ to apical stops. The d is related
downward to two points. One is a neutralization with t as the archiphoneme T. The
other is phonemic voice, labeled Y. Conversely, t is related only to the archiphoneme T.
Parallel relationships hold for the other obstruents. Individual obstruents are related
to obstruent classes: T and P to stops, characterized phonetically by a relation to oral
closure (Cl); x to fricatives, characterized phonetically by a relation to spirant friction
(Sp); and c and č to affricates, characterized phonetically in CSR by groove release
(Gr). These phonological classes are related to the obstruent cluster (ObstCl: details
unimportant for the present study but providing 1-3 obstruents in a row). If phonemic
voice (Y) is present in final position of ObstCl, it takes precedence over lack of voice
at A. This is so far almost identical to the description presented in Sullivan 1974. At
this point there is a divergence: Y has no direct phonetic realization. If voice is not
present, the right-hand branch of A is active and a signal is sent to encode the series
of obstruents via the left-hand branch of B. Anything else that passes the HPT (vowels,
sonants, or voiced stops) is encoded through the right-hand branch of B. Each time
the loop on the right-hand branch of B is active, another signal for voice is sent from
the linguistic system, presumably to the motor cortex in a fully neurological model.
Obstruents can be encoded via either branch of B, depending on the conditions in the
PT. Everything else can only be encoded via the right-hand branch of B. This
represents the full logic underlying the phonetic facts concerning voice in CSR.

Figure 2. The realization of voice in Russian. [Diagram not reproduced.]

Now note the critical nodes regarding voice markedness: A in the PT and B in
the HPT. In A the left-hand (marked) branch is related to voice and the right-hand
(unmarked) branch is related to a lack of voice. In B the situation is reversed. The
left-hand (marked) branch of B is related to a lack of voice and the right-hand (unmarked)
branch is related to voice. Thus voiced obstruents are in the marked relation in the
phonotactics, as in Sullivan 1974, which parallels JCH. The advantages of simplified
specification of phonemic features are retained. Similarly, unvoiced obstruents are in the
marked relation in the HPT, as in Lamb 1966 and 1977. The advantages of automatic
voicing for everything else and of repeated signals for voice are retained. The voice
relation is linear in the PT and hierarchical in the HPT, as in both Sullivan and Lamb.
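
The interplay of the two nodes can be sketched as follows. This is a loose procedural reading of the prose, not a rendering of Figure 2 itself: the functions `node_a`, `node_b`, and `encode`, the branch labels, and the input format are all assumptions introduced for illustration.

```python
def node_a(final_has_voice):
    """PT node A: the left (marked) branch is voice, the right is lack of voice."""
    return "left" if final_has_voice else "right"


def node_b(kind, a_branch=None):
    """HPT node B: the left (marked) branch is lack of voice, the right is voice."""
    if kind == "obstruents" and a_branch == "right":
        return "left"    # voiceless obstruent series: no voice signal
    return "right"       # vowels, sonants, voiced obstruents: voice signal sent


def encode(units):
    """units: (kind, final_has_voice) pairs; final_has_voice is ignored
    unless kind is 'obstruents'. Returns (kind, branch of B, voice signal?)."""
    out = []
    for kind, final_has_voice in units:
        a_branch = node_a(final_has_voice) if kind == "obstruents" else None
        b_branch = node_b(kind, a_branch)
        out.append((kind, b_branch, b_branch == "right"))
    return out


# A vowel, a voiceless cluster such as [tx], a vowel, a voiced cluster such as [db].
for row in encode([("vowel", None), ("obstruents", False),
                   ("vowel", None), ("obstruents", True)]):
    print(row)
```

In this reading voice is the marked outcome at `node_a` but the default outcome at `node_b`, which is how one description can contain both markedness assignments without contradiction.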

In short, with all the logic underlying all observed facts concerning ‘voice
assimilation’ in Russian, the description is complete and the advantages of both
approaches are preserved. No actual contradiction exists.

4. Conclusions. The conclusions are clear. I list them without discussion.

1. You must start from the observed facts (all of them), neurological as well
as phonetic. (Inference: Lamb and Yngve are both correct.)

2. Your description must be logically consistent throughout and must account
for the logic of the system that underlies all of the observed facts. (Inference:
Lamb and Jakobson/Sullivan were both correct, as far as they went, but none
of them went all the way.)

3. Actual contradictions are fatal but can only be exposed by explicit logic.

4. Apparent contradictions can teach us something, as here, where both choices
are plausible.

5. Afterword. Trubetzkoy 1939 had an interesting view of privative relations and 
markedness. In his view column 2 in Table 1 shows unmarked voiced obstruents 
and column 3 shows unmarked unvoiced obstruents. Jakobson wanted voicing as 
a phoneme, rather than as just a component of a segmental phoneme, and he was 
correct in this. But Trubetzkoy was correct to insist that the choice of the marked 
member of the opposition may differ in different contexts, as nodes A and B in Figure 
2 show. We need to remember the work of these men, because it’s too much trouble to 
keep reinventing the wheel. 


1 I do not attribute such a belief to Yngve.

2 Halle 1959, surely the best-known phonology of Russian to date, has some significant
phonetic gaps in its description of word-final obstruent clusters. Therefore I restrict
myself to that portion of obstruent cluster phonology wherein the facts are complete and
comparable in all descriptions.

3 Markedness, applied to linguistics by Jakobson and Trubetzkoy in the 1930s, is a
categorization tool with a long and respected history in scientific classification. Before
DNA mapping, it was the major (or only) tool in biological classifications.

4 I do not mean to make Yngve into a straw man here. Most of his criticisms of linguistic
research are well-taken, and there is surely a contradiction of sorts here.

5 I stipulate the need for a level of biunique contrast in phonological descriptions;
archiphonemic neutralization follows necessarily.

6 The Coleman paper in the present volume discusses the varying usages of data vs.
examples in the linguistic literature. Without disagreeing with either the facts or the
observations in that paper, I would like to add an observation of my own: we exist in a
society and have to speak in a way that lets us be understood, if we are heard. Sometimes
this means using inapt or inappropriate metaphors, as I did in answering a question on the
Polish linguist list not long ago. But even that doesn’t always work. Almost the only place
I’m understood accurately is LACUS. Thus I refer to forms (phonetic or graphic) and
observations or facts.

7 A Chomskyan phonologist once told me that ‘playing games’ like that with markedness,
which he assumed to be a universal, is ‘...almost heresy’.

8 Again, Bloomfield was right about the general descriptive practice in science. Of course,
the practice does not guarantee freedom from all possible error.

9 Halle 1959 refers to the ease of stating rules for assimilation as well. But rules of his
sort are an artifact of the descriptive model, so I ignore this argument. Moreover, the way
these ‘rules’ are described in the two RN descriptions (Table 2A and 2B) can be shown to be
equivalent. The proof is easy but requires space, so I leave it as an exercise for the reader.

10 Bringing in neurological evidence used to be the surest way to start an argument in most
linguistic gatherings. In fact, the denials of the relevance of neurological evidence were
so heated and so categorical that it was generally much safer to avoid any appeals to
neurology. Reference to neurological evidence (or its lack) still starts arguments in
certain venues.


DONALD WEASENFORTH & SIGRUN BIESENBACH-LUCAS


Research in pragmatics has informed the fields of cross-cultural communication,
interlanguage pragmatics and second language acquisition, and applied linguistics
and second language teaching. Of particular interest has long been the study of
speech acts, the way in which speakers of a language use that language to accomplish
communicative purposes, such as requesting goods and services, apologizing
for infractions, extending compliments and expressing gratitude. Linguistic data for
speech act research can be gathered in a variety of ways and can take various forms.
Researchers can observe people using language in natural situations and tape-record
what they say; in this approach unprompted language constitutes data.
Alternatively, researchers may construct discourse completion tasks (DCTs), written
scenarios designed to elicit from subjects in writing what they would say in given
situations; such an approach yields language which has been elicited by deliberate
prompts in a different medium. Researchers may also engage subjects in role
play situations in which they act out the targeted speech acts, or use introspective
verbal report interviews which ask subjects to verbalize the mental processes they
engage in when performing speech acts. For these last two approaches, data consist of
prompted language and introspection about language use, respectively. This variety
of data collection methods and data types raises questions about reliability:
reliability of the data’s reflection of natural language use and reliability of results
across research studies.

Most speech act studies have relied on DCTs (Beebe & Cummings 1996; Kasper &
Dahl 1991; Robinson 1991), as these are an efficient means of gathering a large amount
of data quickly; however, there is evidence that speech act realizations elicited through
DCTs do not accurately reflect what people would say if the data had been gathered
naturally through observation. In other words, the reliability of DCTs has been called
into question (Beebe & Cummings 1996; Hartford & Bardovi-Harlig 1992; Kasper &
Dahl 1991; Rintell & Mitchell 1989), as they are likely to elicit what subjects think they
would say. Studies comparing naturally occurring speech acts with speech act
realizations elicited through DCTs have until now focused on spoken language and written
representations of spoken language in DCTs. In contrast, the present study compares
natural speech act production that occurs in a written medium with DCT elicitation:
the purpose is to compare requests students make of professors via electronic mail
with requests elicited through a DCT, quantitatively, in terms of relative occurrence
of levels of directness (Blum-Kulka, House & Kasper 1989), and qualitatively, in terms
of relative occurrence of syntactic forms. Findings can provide further evidence for
the reliability or unreliability of elicited data collection in the investigation of
speech act realization.

1. BACKGROUND 

1.1. NATURALISTIC DATA COLLECTION AND DCT ELICITED DATA COLLECTION. Two
main methods of data collection have been used in research on speech act realization:
ethnographic, naturalistic methods and elicitation methods using DCTs. Naturalistic
methods yield spontaneous language data from speakers in natural, rather than
contrived, situations and thus provide data reflective of speech act production as it
occurs in the real world. However, data collection may be very time-consuming, as
the speech act under investigation may not occur naturally very often. Speakers’
backgrounds are also difficult to control, and relevant information such as age,
educational background, and ethnicity may not be obtained at all (Beebe 1992). Moreover,
if recording equipment is used, it may be intrusive and adversely affect speech act
production, an instance of Labov’s (1972) Observer’s Paradox; if researchers rely on
note-taking techniques, the accuracy of the data may be affected by the researcher’s
selective memory (Beebe 1994).

DCT elicitation methods collect data in artificial, contrived situations, prompting
respondents to produce a targeted speech act in an imaginary situation that does not
entail any real world consequences. Respondents read written scenarios of an oral
communication situation for which they provide a written response. This method
allows researchers to control respondents’ background and situational factors and
to gather a large amount of data quickly; it also provides insights into the norms
perceived by native speakers and the structure of the targeted speech act (Beebe &
Cummings 1996; Hartford & Bardovi-Harlig 1992). However, the linguistic realization
of the speech act and the relative occurrence of forms may differ from actual use; due
to the artificiality of the situation, respondents may give expected responses,
including a response where no response would naturally occur or responses in forms which
would not occur in an actual situation. Moreover, respondents can opt out of the task
altogether, not providing a response at all (Bonikowska 1988). Further, DCT responses
are highly dependent on stimuli/prompts (Roever 2000) that can easily be
misinterpreted, thus producing an unintended speech act (Cohen 2001; Kasper & Dahl 1991).
Finally, DCTs have been found to encourage fronting of information (Beebe &
Cummings 1996; Hartford & Bardovi-Harlig 1992); the non-interactive nature of the
elicitation method tends to decrease the number of turns necessary to fulfill a language
function, leading to a collapsing of information.

1.2. comparison of data collection methods. In studies comparing natural
speech act data with DCT elicited data, most researchers have found that results
differ. DCTs yield shorter and less complex data than their naturalistic counterparts
(Beebe & Cummings 1996; Bodman & Eisenstein 1988; Hartford & Bardovi-Harlig
1992; Rintell & Mitchell 1989). Hartford and Bardovi-Harlig (1992:45) found that
DCTs restricted the range of semantic formulas used to realize the speech act (in
that case, rejections) as well as the multiple turn negotiations typical of natural data,
and they promoted the use of more direct requests with concomitantly fewer
status-preserving strategies. They observed that ‘the more difficult the situation is to
negotiate in real-life the greater the difference between natural and elicited data,’
pointing to the influence of the artificiality of speech elicitation tasks.

The data in these studies typically come from a relatively small sample of
respondents, involve different speech acts, and represent different relationships between
speakers; thus it is difficult to generalize from their results. In a comprehensive review
of numerous speech act studies, Kasper and Dahl (1991) discuss the validity of various
data collection methods with respect to how well these approximate natural speech
act production. They confirm Wolfson, Marmor and Jones’ (1989) call for the
collection and analysis of naturally occurring data and urge more comparative studies of
different elicitation techniques that take into account the entire speech event in which
the speech act occurs.

1.3. e-mail as collection method. Electronic mail data appear to overcome a
number of the shortcomings of naturalistic data pointed out above: they are naturally
occurring data that do not need to be recorded and transcribed; turns tend to be
collapsed into one message, as writers need to address multiple aspects in one
message in order to avoid lengthy, day-long exchanges; writers write in natural, rather
than contrived, situations; observed language structures constitute natural language
use; and data can be tailored to meet ethnographic standards, as writers’ background
and situational factors can be identified by the researcher. Until now, e-mail data have
been used in few linguistic studies (Danet 1999; Murray 1986) and in only one
pragmatics study (Hartford & Bardovi-Harlig 1996), which examines a small sample of
student requests to faculty. Because research using e-mail data in pragmatics studies
is so limited, generalization of findings may be limited. Electronic discourse may differ
from spoken language, but e-mail offers new frontiers to the research of naturalistic
speech act production.

2. methods. Two sets of data were collected and compared in order to identify the possible
differences that might result from data collection methods. In both cases, the point
of comparison was the requests from students to professors for permission to submit a 
course assignment later than the assigned due date, a highly face-threatening act. 

2.1. ethnographic data. A total of 83 e-mail messages from 64 students, 28 native
speakers (NSs) and 36 non-native speakers (NNSs), to one of two American professors
provided the ethnographic data for the study. From these messages, 87 requests
(44 from NSs and 43 from NNSs) were identified. All messages were unprompted and 
were the initial messages of an exchange of messages in the cases where a series of 
messages were involved. The naturally occurring e-mail messages used in this study 
are ethnographic (cf. Beebe & Cummings 1996) in the sense that the context of the
messages was well defined. The identities of the students and professors were known
and aspects of the situation (e.g., time of request, due dates of assignments, types of
assignments) were likewise identifiable. All messages were submitted via university-based
webmail.

2.2. elicited data. Responses to a discourse completion task (DCT) served as 
comparable elicited data. Fifty-seven students (30 NSs and 27 NNSs) were asked to 
respond to six DCT prompts (see Appendix) by handwriting requests for permission
to submit an assignment late under various situations, providing 342 responses.
The situations had been identified as important contextual factors determining the 
acceptability of such requests (Weasenforth & Biesenbach-Lucas 2001) and included: 
1) requesting after the due date and without any attached work, 2) requesting before 
the due date, and 3) requesting after the due date with attached work. In all three 
situations, students were asked to write their requests under two conditions: having 
a good reason for the request, and not having a good reason for the request. Students 
were asked to complete the DCT, and an accompanying questionnaire, at their leisure 
outside of class and to return both forms to the researchers. A total of 325 requests 
(166 from NSs and 159 from NNSs) for late submission were included in the data for 
the present study. 
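
The design just described can be checked with a little arithmetic. The following Python sketch is only an illustration: it builds the six prompts as the cross of the three situations and the two reason conditions, and recomputes the response and request totals reported above. The labels and variable names are assumptions chosen for readability and are not taken from the instrument itself.

```python
from itertools import product

# Labels paraphrase the three situations and two conditions described above;
# the identifiers themselves are illustrative, not part of the DCT.
SITUATIONS = (
    "after due date, no work attached",
    "before due date",
    "after due date, work attached",
)
CONDITIONS = ("good reason", "no good reason")

# Each prompt pairs one situation with one condition.
prompts = list(product(SITUATIONS, CONDITIONS))

respondents = 30 + 27          # NSs + NNSs who completed the DCT
responses = respondents * len(prompts)
requests_analyzed = 166 + 159  # NS + NNS requests retained for analysis

print(len(prompts), respondents, responses, requests_analyzed)
```

The printed values (6 prompts, 57 respondents, 342 responses, 325 requests) agree with the figures given in this section.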

2.3. participants. The present study included 53 NSs and 62 NNSs. All NSs were 
students in a TESOL program at an American university; NNSs were either TESOL 
students (16) or advanced-level ESL students (46). All students were graduate students
and all were highly proficient in English. One NNS student whose e-mail was
analyzed also completed the DCT; five NS students were represented in the DCT and 
e-mail data. 
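
The participant totals follow from the group sizes in sections 2.1 and 2.2 once the students who appear in both data sets are counted only once. A minimal Python sketch of that bookkeeping, with dictionary names that are purely illustrative:

```python
# Counts reported in sections 2.1-2.3.
email = {"NS": 28, "NNS": 36}    # students who wrote e-mail messages
dct = {"NS": 30, "NNS": 27}      # students who completed the DCT
in_both = {"NS": 5, "NNS": 1}    # students represented in both data sets

# Inclusion-exclusion: count each student once.
totals = {group: email[group] + dct[group] - in_both[group] for group in email}

# The NNS total should also equal the TESOL and ESL subgroups combined.
nns_by_program = 16 + 46

print(totals, nns_by_program)
```

Both checks give 53 NSs and 62 NNSs, matching the totals stated above.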

2.4. analysis. Student requests were analyzed for grammaticolexical indexicals of 
directness and categorized according to three levels of directness following the widely 
used framework for analysis of requests by Blum-Kulka, House and Kasper (1989). 
Only the head act (the most explicit form of the request) of each request was analyzed.
Supportive moves (e.g., apologies, justifications, commissives) were identified
in the process but are not discussed in this paper. The three levels of directness used 
in this study are defined as follows: 

Level 1: Direct Requests are requests for which the illocutionary force is most transparent.
They raise no doubt on the part of the reader about the import of the message;
no interpretation is required. The grammaticolexical forms linked to directness in this
study include the following: Performatives (e.g., I’m writing to request an extension of
the due date); Questions (e.g., Is there a chance of getting an extension?); Need statements
(e.g., I need an extension); and Imperatives (e.g., Please consider an extension).

Level 2: Conventionally Indirect Requests are requests which are modified 
syntactically or lexically so that the directness is somewhat veiled. These requests 
are realized through the use of: Preparatory condition using Could/Can/Would
you..., Could/May/Can I/we... (e.g., Could you give me three more days? Can I have
an extension?); Hedged performatives (e.g., I would like to ask you to extend the due
date); Preference statements (e.g., I’d appreciate an extension. I’d like you to grant me
an extension); Embedded constructions with complementizer/infinitive (e.g., I hope
that you’ll give me more time. I hope to get the assignment done by tomorrow); and
Embedded constructions (including conditionals), such as I am/was wondering if
it’s possible to get an extension; Logical conditionals (e.g., If you could give me more
time, I would have a better paper); Acceptability conditionals (e.g., If it is OK, could I
get an extension?); and other conditionals (e.g., I would be thankful if you could give
me an extension).

Level 3: Non-Conventionally Indirect Requests are requests that require interpretation,
requests for which the illocutionary force is not transparent and could
simply be understood as a statement: Hints (e.g., I have difficulties to submit my
assignment on time).
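
To make the three-level scheme concrete, the sketch below assigns a level to a head act by matching a few surface patterns drawn from the example forms listed above. It is an illustration only and does not reproduce the coding procedure used in the study: the regular expressions, the function name classify_directness, and the rule that anything unmatched counts as a hint are all assumptions, and such a pattern list would miss many real requests.

```python
import re

# Hypothetical surface patterns, loosely modeled on the forms listed above.
# They are assumptions for illustration, not the coding scheme used in the study.
LEVEL_2 = [
    r"^(could|can|would) you\b",            # preparatory: Could/Can/Would you...
    r"^(could|may|can) (i|we)\b",           # preparatory: Could/May/Can I/we...
    r"\bi('d| would) (appreciate|like)\b",  # preference statements, hedged performatives
    r"\bi hope\b",                          # embedded: I hope that/to...
    r"\bi (am|was) wondering\b",            # embedded: I am/was wondering if...
    r"\bif\b",                              # logical, acceptability and other conditionals
]
LEVEL_1 = [
    r"\bi('m| am) writing to (request|ask)\b",  # performatives
    r"^is there\b",                             # direct questions
    r"\bi need\b",                              # need statements
    r"^please\b",                               # imperatives
]


def classify_directness(head_act: str) -> int:
    """Return 1 (direct), 2 (conventionally indirect) or 3 (hint)."""
    text = head_act.strip().lower().replace("\u2019", "'")
    # Level 2 is tested first so that softened forms such as
    # "I would like to ask ..." are not counted as direct.
    for level, patterns in ((2, LEVEL_2), (1, LEVEL_1)):
        if any(re.search(p, text) for p in patterns):
            return level
    return 3  # nothing matched: treat as a non-conventionally indirect hint


for example in (
    "I need an extension",
    "Could you give me three more days?",
    "I have difficulties to submit my assignment on time",
):
    print(classify_directness(example), example)
```

Testing the Level 2 patterns before the Level 1 patterns is a design choice of this sketch: conventionally indirect forms are built by modifying direct ones, so the more specific, softened pattern has to win when both could apply.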

3. RESULTS 

3.1. variations in data access. The type of data collection method selected determined
the type and amount of data which was accessed. While the DCT was a much
more efficient method of collecting a large set of data (cf. Beebe & Cummings 1996), 
it also constrained the data. Some respondents (11%) did not respond to all of the DCT
prompts (cf. Bonikowska 1988) or provided responses other than the expected
request type (cf. Cohen 2001). The DCT also forced a response where none would
otherwise have been given. A number of students noted that they would not ask for an extension
under any circumstances, or not via e-mail; nevertheless, they did so in
the DCT. On the other hand, the DCTs elicited useful information that would not
have been available through ethnographic methods (Hartford & Bardovi-Harlig 
1992). Some respondents, for instance, indicated that they would make requests for 
late submission in person but would not do so by sending an e-mail message to a 
professor. All students also provided insight into what they considered good and bad 
reasons for asking for an extension. These types of information are useful in identifying
aspects of the communication context which determine pragmatic variations and
provide insight into different cultural expectations (Beebe & Cummings 1996). 

E-mail will also restrict the amount of data a researcher obtains. Some students 
reported that they would not ask for extensions of due dates in e-mail, although they 
would do so in person, because they felt that e-mail was an inappropriate medium. 
These requests would not be available to a researcher looking only at e-mail messages. 
On the other hand, e-mail may provide the distance needed for some students to 
make such face threatening requests (Daft, Lengel & Trevino 1987; Drake, Yuthas & 
Dillard 2000). 

Differences in the number of requests due to collection method are reflected in 
Figure 1, which provides the percentages of all requests according to the pragmatic 
situation.

Figure 1. Requests by situation. (Situation 1 = requesting after the due date and
without attached work; Situation 2 = requesting before the due date; Situation 3 =
requesting after the due date and with attached work.)


One of the most obvious differences across collection methods is the greater use 
of requests by NSs in e-mail (70%) as compared to the DCTs (41%) in Situation 2. For 
both NSs and NNSs, the proportion of Situation 3 requests increases in DCTs. Both 
differences may be due to the fact that the threat of asking for an extension, especially 
after the due date, is neutralized in the DCTs, since the students are not actually 
in that situation and face no real-life consequences (Beebe & Cummings 1996; 
Bodman & Eisenstein 1988; Robinson 1991). With the exception of NNSs in Situation 
1, it appears that the DCTs prompted requests which do not occur in actual e-mails, 
probably due to the face threatening and awkward nature of asking for an extension 
after the due date. From a cross-cultural perspective, it is interesting to note that the 
distributions of requests for NSs and NNSs are similar for the elicited data but very 
different for the e-mail data. The different profiles raise questions about the reliability 
of research results based on either set of data. 


3.2. VARIATIONS IN DISTRIBUTION OF REQUESTS. 

3.2.1. requests versus supportive moves. The same questions are raised in light 
of the distribution of messages with head act requests (in addition possibly to supportive
moves) versus messages with supportive moves only. Figure 2 provides the
percentages of all messages and DCT responses that included only supportive moves; 
that is, they did not include any form of request. E-mail messages and DCT responses 
for NSs and NNSs for each pragmatic situation are represented (e.g., NS1 represents
messages/responses for the NS participants for Situation 1). With only one exception, 
there are more supportive move-only e-mail messages than DCT responses for both 
NSs and NNSs. The variations in deployment of semantic formulae and the differences
across collection methods are most apparent in NNSs’ messages/responses
for Situation 3 (cf. Hartford & Bardovi-Harlig 1992). The smaller occurrence of
supportive move-only DCT responses may be due to the test-like and artificial nature of
DCTs, which neutralizes the threat of asking for an extension. However, students may
feel less threatened in sending an e-mail in which they apologize, justify their lateness,
and/or commit themselves to submitting work by a certain time rather than actually
requesting an extension.

Figure 2. Supportive move-only messages/responses.

Figure 3. E-mail vs. DCT for Situation 1.

3.2.2. comparison of collection methods by situation. Analyses of the level 
of directness of requests for each pragmatic situation also reveal differences in data 
associated with data collection method. Figure 3 provides percentages of requests by 
level of directness for Situation 1 (requesting after the due date without attaching 
work) and shows that differences between NSs and NNSs are more apparent in e-mail
than in the DCTs. Further, for both groups of participants, there is greater use
of direct and conventionally indirect requests and a decrease in non-conventionally
indirect requests in DCT responses. This general tendency toward more directness
in the DCT responses is consistent with Hartford and Bardovi-Harlig’s (1992) findings
that NNSs used more assertive forms of requests in DCTs. The testing nature
of the DCT may have prompted students to make an explicit request. The reality of
social implications may also account for less directness in e-mail. In e-mail requests
to real professors, especially in the situation of asking for an extension after the due
date, hints may be seen as safer.

Figure 4. E-mail vs. DCT for Situation 2.

Figure 5. E-mail vs. DCT for Situation 3.

Figure 4 represents the distribution of requests by directness level for Situation 2,
requesting an extension before the due date—the least face-threatening of the three 
situations. The results reveal tendencies very similar to those for Situation 1. In general,
students are more direct in their responses to the DCT. Both groups use fewer
hints in the DCTs, and NNSs use more direct requests than they do in e-mail. 

As Figure 5 shows, the results for Situation 3 (requesting an extension after the due 
date and attaching work with the request) differ somewhat from those of the previous
two situations. Similar to the other two situations, there is a greater use of direct
requests in DCTs, and fewer hints in DCT responses, at least from NNSs. Interestingly,
there are no occurrences of direct forms in the e-mail messages of either group.
This is possibly due to the bold, face threatening nature of asking for an extension
after the due date while submitting the assignment at the same time. Since the threat
does not really exist in responding to DCTs, students are more likely to provide such
requests, neglecting the social protocol. The differences across groups are also interesting.
As pointed out above, while the profiles for NSs and NNSs in the DCT data are
again similar, the two groups appear quite different in the e-mail data.

3.3. Variations in Linguistic Forms. The two data collection methods yielded 
notable differences in how students linguistically realized their requests. Contrary to 
previous research (Beebe & Cummings 1996; Hartford & Bardovi-Harlig 1992), the 
DCTs in this study yielded a greater variety of linguistic forms than did the e-mail 
data. Four linguistic forms, namely performatives (I am writing to request...), Can you...,
Would you..., and Do you think..., occurred in DCT responses but not in e-mail
requests. This might be due to the smaller sample of e-mail messages, but differences
across collection methods also occur within each group. NSs used Need statements and I
wanted to know... in e-mail but not in DCT responses. Conversely, they used imperatives,
Would you..., and Preference statements in DCT responses but not in e-mail.
They also rarely used Acceptability conditionals in DCT responses but relatively often
in e-mail. NNSs, on the other hand, did not use, or rarely used, Would you... and Could
you... in e-mail but often used both forms in DCT responses.

4. Conclusions and Implications. The comparison of naturalistic and elicited data 
collection methods has shown that both collection methods limit the amount and 
type of data collected. While the DCT offered insights into reasons for variations in 
language use due to the design of the DCT and comments made by the respondents, 
this collection method also promoted a greater use of direct (all but NSs in Situation
2) and conventionally indirect request forms (all but NSs in Situation 3) and a lesser
use of hints (all but NSs in Situation 3) than was found in the e-mail data.
Also, the occurrence of semantic formulae in the realization of the entire speech act 
varied across collection method for both groups. It is possible that the artificiality and 
test-like nature of the DCT accounts for these findings; the risks entailed in use of 
direct forms are neutralized in a contrived situation with no real-life consequences. 
In addition, the profiles of NSs and NNSs looked rather similar when considering the 
DCT data, but differences between NSs and NNSs were more obvious in the naturalistic
data. Thus, researchers studying cross-cultural pragmatic differences in request
forms would get a very different picture of the differences and similarities of the two 
groups depending on the data collected. 

A comparison of linguistic forms across the two methods revealed a comparatively
small range of forms in the naturalistic data, which is not consistent with previous
studies (Beebe & Cummings 1996; Hartford & Bardovi-Harlig 1992; Rintell &
Mitchell 1989) but might be the result of the strictly controlled context and stable
relationship between requester and request granter. The two collection methods also
produced interesting differences between NSs and NNSs: Could you... and Can you...
constructions were used by NNSs, but not by NSs, pointing to NNSs’ lack of pragmatic
awareness.

As each method of data collection has advantages and shortcomings, a triangulation
of methods is ideal (Cohen 2001; Hartford & Bardovi-Harlig 1992; Kasper &
Dahl 1991). Natural data allow researchers to identify salient characteristics of pragmatic
context which can be tested in DCTs; in contrast, DCTs can reveal explanations
for variations in naturally occurring data, e.g., some DCT respondents indicated that 
they would not make certain types of requests via e-mail. However, an important 
conclusion from the present study remains that some pragmatic differences may not 
surface in DCTs, but do surface in natural data; thus, speech act research cannot rely 
on elicited data alone if it aims at drawing reliable conclusions about NSs’ and NNSs’ 
pragmatic performances. 


APPENDIX: DISCOURSE COMPLETION TASK 

Directions: Read the following situations and then write the request which you 
would e-mail to your professor. 

1. You did not submit a major assignment on time and have not asked for permission
to submit it late. You still have not completed the assignment and you now
want to ask the professor for permission to submit the assignment late. 

Assuming you have a good reason for your request, in your e-mail message 
you write: 


Assuming you do not have a good reason for your request, in your e-mail message
you write:


2. Approximately 2 weeks before a major assignment is due, you conclude that it 
will not be possible for you to complete the assignment and submit it on time. 
You want to ask the professor for permission to submit the assignment late. 

Assuming you have a good reason for your request, in your e-mail message 
you write: 


Assuming you do not have a good reason for your request, in your e-mail message
you write:













3. You did not submit a major assignment on time and have not asked for permission
to submit it late. You decide to e-mail the assignment and attach the assignment
2-3 days late.

Assuming you have a good reason for your request, in your e-mail message 
you write: 


Assuming you do not have a good reason for your request, in your e-mail message
you write:


ISSUES IN HARD-SCIENCE LINGUISTICS


Victor H. Yngve 
University of Chicago 


What is human or hard-science linguistics? I take linguistics in the usual sense 
as a discipline with a research community, a cohesive literature, faculty and students 
in academic departments, and organized professional societies. The term linguistics 
does not rule out interest in systems of communicating used by the deaf, or interest in 
facial expressions and body motion. Linguistics in this broad sense embraces a freely 
expandable scope of interrelated phenomena that have long been recognized as being 
both individual and social in nature. 

Essentially all leading linguists for the past two centuries have accepted the goal 
that linguistics is or should be a science. No particular body of theory, however, 
grammatical or other, has emerged as acceptable to all. Indeed, linguistic theory has 
suffered frequent changes as one school after another has achieved a measure of dominance.
And the proper shape of linguistic theory has throughout history been the
subject of heated but inconclusive debate. 

1. linguistics and science. So what, then, is hard-science linguistics? It is called 
hard-science linguistics to distinguish it from most current brands of linguistics 
which are properly characterized as soft science. There are a number of crucial differences
that can be a source of confusion if they are not kept well in mind:

(1) Hard-science linguistics is a natural science, like physics, chemistry and 
biology. This is its major difference from current approaches, which are philosophically
based.

(2) Hard-science linguistics takes science seriously. As a natural science it studies
parts of the real physical world. Current linguistics, on the other hand,
studies non-physical constructs. 

(3) Thus hard-science linguistics focuses primarily on people from the point 
of view of how they communicate, and on the sound waves of speech, the 
light waves of gestures, and other physical means of communicative energy 
flow. It also includes consideration of other relevant parts of the real physical
world. Soft-science linguistics, on the other hand, focuses on nonphysical
constructs such as language and signs. 

(4) In the hard sciences it is standard practice to test the predictions of theory
against the real world through careful observation and experiment. But in
the soft sciences theories are not testable against the real world.

(5) In the hard sciences theories are testable because the hard sciences study
observable real-world objects, whereas the soft sciences study unobservable
nonphysical objects, for which there is no objective evidence.

(6) The hard sciences have developed standard objective criteria for deciding
what to believe about the natural world.

(a) The criterion for assessing theories is that their predictions must agree 
with the results of tests through real-world observations and experiments.

(b) The criterion for assessing observational and experimental results is 
their reproducibility. 

These criteria cannot be applied in the soft sciences, which do not study the real 
world. They are forced to fall back on various a priori philosophical criteria such as 
simplicity, symmetry, ‘naturalness’, etc. or on sheer intuition about what is plausible. 
Investigators can and do differ on these matters, which are subjective and arbitrary. 
The result is that the discipline then drifts from one fashionable body of theory to 
another and from the dictates of one charismatic linguist to another. 

(7) The hard sciences have paid careful attention to what assumptions they are 
willing to accept. They have pared them down to only four standard assumptions.
The soft sciences, on the other hand, freely admit as many untestable
assumptions as they wish. The four standard assumptions of the hard sciences
are:

(a) that there is a real world out there to be studied; 

(b) that it is coherent, so we have a chance of finding out something about it; 

(c) that we can reach valid conclusions by reasoning from valid premises; 

(d) that observed effects flow from immediate real-world causes. 

All other assumptions have been converted into hypotheses to be tested. Those that 
do not pass tests against real-world evidence or that are untestable have been eliminated.
Soft-science linguistics, however, typically accepts a number of scientifically
unjustified special assumptions that take it outside of the natural sciences. There 
are assumptions about utterances, language, meanings, signs, and typically dozens of 
others, either explicit or implicit and hidden. 

2. two incompatible goals and a dilemma. It has not been generally realized that 
the current difficulties in linguistics stem in large part from the incompatibility of 
the modern goal of making linguistics a science and the traditional goal of studying 
language. 

Accepting language as an object of study leads to accepting the scientifically unjustifiable
special assumptions of a philosophically-based program of grammatical and
semiotic research that can be traced back to the ancients. In hard-science linguistics
we must continually be on guard against traditional soft-science assumptions that
threaten to lead us astray. It’s a question of priority of goals. If we give priority to
studying language, we cannot have a true science. If we give priority to science, we 
must give up the goal of studying language. 

Giving up language in favor of science would be a victory for linguistics, not a defeat. 
If we study the people who speak and understand rather than studying language, we can 
actually build a genuine hard-science linguistics that can stand among the other natural 
sciences and take the place of the present autonomous soft-science linguistics. 

The other natural sciences are built on hard-science foundations and each has a 
conceptual structure that is specific to its subject matter. Physics studies selected parts 
of the real world from the physical point of view and has concepts of mass, energy, 
momentum, force, and so on. Chemistry studies parts of the real world from a chemical
point of view and has concepts of atoms, molecules, valence, reaction rate, and
so on. Hard-science linguistics also rests on hard-science foundations rather than on 
the traditional semiotic-grammatical foundations. It studies selected parts of the real 
world, people, from the point of view of how they communicate. To support such 
studies a new subject-matter specific conceptual structure on which to build a new 
hard-science linguistics is now available (Yngve 1996). This replaces the conceptual 
structure of grammar. 

But if we start over and build a new linguistics on hard-science foundations and 
the new conceptual structure, we are faced with a dilemma. There is a vast literature 
accumulated over centuries containing a wealth of linguistic knowledge, almost all 
from a soft-science point of view. Thus it is based on or incorporates many scientifically
unjustified assumptions. The dilemma is whether to try to make use of this vast
treasure and possibly be misled by it, or to ignore it and risk losing the many valid 
insights it may contain. 

For two centuries we have been trying to make linguistics a science, not by moving 
it onto proper hard-science foundations but by continuing to give priority to the 
study of language rather than to science. In this we have been encouraged by those 
philosophers trying to redefine science and legitimize the soft sciences. The result has 
been a soft-science linguistics dedicated to studying an object, language, introduced 
only by scientifically unjustified and untestable assumptions. After two centuries it is 
now clear that this course has led only to confusion and chaos in the discipline. 

What we must do instead to resolve the dilemma is to make use of what is already 
known where possible, but leave the old world of the soft sciences once and for all and 
become pioneers in the new world of the hard sciences. We must mount a program of
research to reconstitute linguistics on hard-science foundations and the new conceptual
structure that is now available.

3. reconstituting linguistics on hard-science foundations. Let us see how
one might begin. Suppose two people are in conversation. See Figure 1. There are
three distinct physical things here (shown below the double line): person A, person
B, and the sound waves that pass between them. In hard-science linguistics all the
linguistic structure lies in these two people and in the pair of them communicating
by sound waves and other energy flow. The linguistic structure is represented in the
theory in terms of the properties of systems, where ‘property’ and ‘system’ are taken
in their normal sense in the natural sciences. The communicating individuals, participants,
role parts, and the linkage are all systems in this sense. The sound waves are
seen simply for what they physically are, pure energy flow: they carry no linguistic
structure at all.

Figure 1. Two persons communicating by means of sound waves. (Real-world objects
are in boldface, the corresponding systems are in light face.)

In the tradition, on the other hand, the sound waves are seen as somehow supporting
assumed utterances that are analyzed in terms of some theory of language, L, and
that in some mysterious way carry a meaning or message or information from one 
person to the other, who are then seen as ‘using’ language. 

Note that the linguistic structure postulated for utterances is not at all inherent in the 
physical sound waves. It cannot be recorded by instruments. The similarities and differences,
on which phonemes were postulated for example, require a person to perceive
them, and different persons perceive them differently. Traditional theory does not take 
any persons into account. A proper scientific linguistics must be a linguistics focused 
not on language but on people, for that’s where the linguistic structure really is. 

I have chosen this example of husband and wife for clarity because we happen to 
have a name for the two of them, a couple. The couple is part of the real physical
world. You could invite them to dinner. The couple, together with the sound waves 
and other relevant parts of the real world, are represented in the theory as a linkage, 
which is characterized by properties in the same way as the other systems. 

Besides couples there are a variety of other assemblages of multiple persons and 
objects that can likewise be analyzed as linkages. Linkages can participate in higher-level
linkages in a hierarchy of individuals and linkages. In this way we can represent
the linguistic structure of the many systems in a complex community or society. 
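
The nesting of linkages within higher-level linkages can be pictured as a simple tree. The Python sketch below is a loose illustration under strong simplifying assumptions, not a rendering of the formal theory: the class names, the example names, and the omission of participants, role parts, properties, and the sound waves and other physical objects that belong to a linkage are all choices made here for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    name: str


@dataclass
class Linkage:
    name: str
    members: list = field(default_factory=list)  # Individuals or other Linkages

    def individuals(self):
        """Return every individual found anywhere below this linkage."""
        found = []
        for member in self.members:
            if isinstance(member, Linkage):
                found.extend(member.individuals())  # descend into lower-level linkage
            else:
                found.append(member)
        return found


# Hypothetical example: a couple taking part in a larger gathering.
couple = Linkage("couple", [Individual("A"), Individual("B")])
dinner = Linkage("dinner party", [couple, Individual("host")])
print([person.name for person in dinner.individuals()])
```

Here the couple is itself one member of the larger linkage, so walking the hierarchy recovers the three persons A, B and the host.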

4. two new domains of theory. So we see that hard-science linguistics provides
two domains of theory (shown above the double line in Figure 1): domain 1 at the
individual and participant levels (below the single line), and domain 2 at the role part 
and linkage levels (above the single line). This represents the fact that communicating 
is both individual and social. These two domains of theory differ in the real-world 
evidence, individual or social, on which the properties of the systems are postulated. 
Properties of individuals and properties of participants, 1, are set up on the basis of 
observed similarities and differences of individual persons. Properties of role parts 
and properties of linkages, 2, are set up on the basis of observed similarities and 
differences of assemblages of people and objects at the social level. Comparing two 
objects separately with a physical measuring scale is, of course, equivalent to comparing
them directly with each other.

The tradition, on the other hand, provides only one domain of theory, language, 
and it has never been clear whether it is individual or social or some abstraction 
which is neither. 

Note that Chomsky’s ideal speaker-hearer, SH in Figure 1, does not answer to any
real-world evidence from real persons. There is nobody underlying it that you could 
invite to dinner. Incredibly, it is defined entirely in terms of language, which does not 
exist in the real world, and is completely subservient to the assumptions and definitions
of a particular linguist. In fact, even the structured utterances here, or the text in
other theories, has to be introduced by a special subject-matter-specific assumption, 
as Bloomfield already pointed out. 

5. properties of systems. We observe that everyone is different communicatively 
from everyone else. The uniqueness of the individual is expressed in terms of postulated
properties of the systems that model them. Properties are set up on the basis
of observed real-world communicative similarities and differences of persons and 
assemblages. They may be taken as binary variables without loss of generality. These 
properties can be represented here as: 

ABCDEFGHIJKLMNOPQRSTU..., 

where the underlined properties figure in the discussion 
below concerning how properties change. 

Hard-science linguistics is often called human linguistics to distinguish it from the 
traditional linguistics of language. The name human linguistics is particularly apt 
since its theory is founded on the very uniqueness of individuals and groups celebrated
in the humanities rather than on the normative basis of grammar which would
make everyone the same. 

Properties change dynamically as people learn and speak and understand. Some 
properties change quite rapidly, others may stay fixed for longer periods of time. The 
knowledge of how to communicate is represented in terms of properties. 

We also observe that communicative behavior is heavily context dependent. Properties
reflect both inputs to the system and its current state, which represents the current
context in which the ongoing communicative behavior is understood.

It has been shown that the structures of properties in the various systems such as 
individuals, participants, role parts, and linkages can all be organized according to the 
same conceptual structure and formal theory. 

6. procedures. Properties are structured in part in terms of procedures. Procedures 
are dynamic causal laws of communicative behavior postulated on the basis of observational
and experimental evidence from the real-world objects modeled. A procedure
specifies how some property changes value in dependence on the current values
of other properties, some of which may represent the situation or context, others the 
results of inputs. 

Procedures are triggered when specific properties (inputs and context) take the
values specified in the logic expression on the left. They then change the value of a
property as specified on the right after a specified time delay Δt, as for example:

C × −G ∨ N :: Q, Δt,

where × is ‘and’, ∨ is ‘or’, − is ‘not’ and :: is read as ‘sets’ (the indicated
property on the right to the indicated value).
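
The example procedure can be read as a guarded, delayed update of one binary property. What follows is only a minimal sketch in Python, under several assumptions that are not in the text: the left-hand expression is grouped as (C and not G) or N, the procedure sets Q to true, and the delay is counted in whole time steps. The names Procedure, schedule and apply_due are hypothetical and are not part of the theory's own notation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, bool]            # property name -> current binary value
Change = Tuple[int, str, bool]     # (due time, property name, new value)


@dataclass(frozen=True)
class Procedure:
    condition: Callable[[State], bool]  # logic expression on the left
    target: str                         # property named on the right
    value: bool                         # value the property is set to
    delay: int                          # time delay, in whole steps (assumption)


def schedule(state: State, procs: List[Procedure], now: int) -> List[Change]:
    """Return one pending change for every procedure triggered at time `now`."""
    return [(now + p.delay, p.target, p.value) for p in procs if p.condition(state)]


def apply_due(state: State, pending: List[Change], now: int) -> State:
    """Return a copy of `state` with every change due at or before `now` applied."""
    new_state = dict(state)
    for due, name, value in pending:
        if due <= now:
            new_state[name] = value
    return new_state


# The example procedure, with the grouping (C and not G) or N assumed.
example = Procedure(
    condition=lambda s: (s["C"] and not s["G"]) or s["N"],
    target="Q",
    value=True,
    delay=2,
)

state = {"C": True, "G": False, "N": False, "Q": False}
pending = schedule(state, [example], now=0)
state_at_1 = apply_due(state, pending, now=1)  # delay not yet elapsed
state_at_2 = apply_due(state, pending, now=2)  # delay elapsed, Q is set
```

The sketch replaces a property value rather than adding to a record, which is one way of picturing the point that communicative behavior results in change, not accretion.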

Thus communicative behavior results in change, not accretion, so hard-science linguistics
is basically pragmatic in its foundations, in contrast to the tradition, which
treats pragmatics as an afterthought if at all. 

The current situation or context that affects ongoing communicative behavior is represented
in what is called the domain of control. The execution of procedures in dependence
on the dynamically changing properties in the domain of control answers to what
would be spoken of colloquially as a person following a conversation or being ‘with it’.
This accords with the observation that what a person says or understands in any given
situation depends on the situation, which is dynamically changing. Thus the proper
handling of context is a central feature of the theory, and a major difference from grammar.
Rules of grammar do not involve the situation or context in this ongoing sense.
They are generally set up on the basis of examples taken out of context. 

The term human linguistics is thus apt for another reason. Since it focuses on what 
individual persons do and say and understand in particular circumstances, it accommodates
the uniqueness of situations of the humanities. It can do this in a true science
because it generalizes in terms of individual properties rather than in terms of a
whole language. 

The linguistics of language, in concentrating on separating the grammatical 
from the ungrammatical, is not far removed from the prescriptive tradition and 
ideals of correctness. 

Procedures are often organized in terms of a hierarchy of communicative tasks and
subtasks. There can also be parallel tasks to accommodate the possibility that a person
or a group may do several things at the same time. Communicative tasks are often subtasks
of nonlinguistic tasks representing what it is that the communicative behavior is
coordinating. Thus hard-science linguistics is closely connected with other social and
social psychological disciplines and interfaces naturally with practical affairs. It does not
suffer the isolation of language and autonomous grammar.

Figure 2. Two people bargaining over the price for an antique. (Real-world objects are
in boldface, the corresponding systems are in light face.)

The organization of properties, procedures, task hierarchies, etc. of a system is 
called the plex structure of that system. So in hard-science linguistics we observe and 
experiment on how people communicate so as to postulate and test plex structures 
for them. This is quite different from taking people as informants or witnesses for the 
study of language. 

If we keep well in mind the kinds of differences highlighted here so as not to be 
confused or misled by them, it should be a fairly straightforward though lengthy and 
challenging task to reconstitute all of linguistics piece by piece on hard-science
foundations.

7. evidence. Let us now look at some more examples with a particular focus on evidence.
Evidence is always relative to the theory that it is interpreted in terms of, that is, the
theory it is designed to support or contradict. Thus it is also relative to any hidden
assumptions underlying that theory. Hidden false assumptions can lead to false interpretations.
The depth hypothesis proved untestable because of hidden false assumptions
underlying the phrase-structure theory it was interpreted in terms of.

In hard-science linguistics we seek evidence for properties of systems modeling 
real-world people from the point of view of how they communicate, not evidence 
for properties of immaterial objects like words and sentences. We seek evidence for 
similarities and differences between different persons and the same person at different
times, and between different groups and the same group at different times.

Let us take the example of two people bargaining over the price for an antique 
offered for sale. See Figure 2. 

First, there are properties postulated and tested against evidence from individual persons.
Here at the individual level there are general properties representing the person’s
knowledge, abilities, aspirations, values, etc. and at the participant level there are properties
concerned with how much the collector would be willing to give for the antique during
this bargaining session and properties concerned with how little the dealer would be willing
to accept.

Second, there are properties postulated and tested against evidence from assemblages
of persons and other associated real-world objects. Here at the role-part level
there are properties concerned with customer offering, clerk refusing, making a counter
offer, customer making a show of walking out, etc., and at the linkage level, there
are properties representing the current changing state of negotiation of the sales 
group, the antique, the money, and other relevant real objects. 

As another example, consider two strangers getting acquainted, as videotaped and 
described in earlier publications. 

Here at the individual level we find properties representing the personal backgrounds
of the two persons (that they later discuss) and at the participant level there are
properties representing the original request of the researcher separately to each, what 
little he told them about each other, and the task he gave them of getting acquainted. 

Regarding social properties: at the role part level are properties representing each
person’s moves and responses in dialog, and such things as checking with each other
on the extent of their actual commonality in knowing what the task is. Thus we find
the woman saying, ‘and, I don’t know how much you know about me—at all’, and he
says, ‘I know nothing about you at all. It’s all a big secret’. To which she replies, ‘All
right—should I start, then?’ And he says, ‘You start (single nod)’. In this way the information
about their common assigned task is moved up from the individual participant
level to the social role-part level. They both now know what activity ‘start’ refers
to. We see them both get acquainted. Then at the linkage level are properties of the
changing current state of their dialog and their growing acquaintanceship. 

There are also ongoing investigations of historical change, linguistic variation, 
multi-lingualism, translation, etc. (Yngve & Wąsik forthcoming). The specific nature
of evidence in each case is different and relative to the area under investigation. 

In spite of what some philosophers of science may say, scientific research is not 
focused single-mindedly on so-called falsification. We’re in the physical domain, 
not the logical domain of theorems and proofs. In seeking the truth about nature, 
we emphasize real-world exploration. Research in the physical domain is more like a 
detective trying to solve a crime. There’s no single simple route to a guaranteed solution.
A number of techniques are available to us including field observations, videotapes,
and interviews.

In hard-science linguistics, we are all studying the same physical reality, people, 
from the point of view of how they communicate, and the relevant physical surroundings.
Thus evidence and theory from one area of linguistics is often relevant
to the study of questions in other areas and even in neighboring disciplines that 
study people from other points of view. Linguistics thus moves from the isolation of 
autonomous grammar into the real world of the natural sciences. We can say goodbye
to the era of grammatical fads and fashions and enter a new era for linguistics
where our theories will have the solidity, permanence, and real-world relevance 
already familiar in the other natural sciences. 


CLASSICAL ARABIC VS. DIALECTS: A COMPARATIVE STUDY


Fa’eqa Alsadeqi 
University of Bahrain 


The Arabic language exists in two forms: the classical (sometimes called standard
or formal, hereafter CA) and the dialects. CA is a highly sophisticated language with
complex grammatical rules and irregularities; it is reserved for official use and is
a lingua franca among Arabic-speaking communities. The dialects are spoken languages
that vary from one region to another, and they are sometimes unintelligible to
Arabs in different parts of the Arab world.

Similarities between CA and the dialects have led Arab grammarians and linguists 
to compare CA to the dialects now spoken. Some argue that the dialects developed 
from the CA; others say that the CA is a variety of Arabic that was spoken in the past. 
When one examines data from Arabic dialects, many common features that
are exclusive to the dialects come to the surface, indicating that their grammar differs
from that of CA, should one consider the latter to be the standard.

This paper is a comparative study of some grammatical features of both forms of 
Arabic. Its purpose is to show through grammatical evidence that CA is a language 
that was probably not spoken but rather contrived for special purposes. (This is not to 
dispute the existence of CA, but its existence as a spoken language.) How and when 
this form of Arabic came about is uncertain, because of the lack of evidence from the
pre-Islamic era, but we know that CA was used for poetic works. Special fairs were set
up in pre-Islamic times for poetry and literature (Hilaal 1998 and Lutfi 1976). In such
markets, the language of poetry was the fus-haa ‘classical’, although people reverted
to their own dialects when communicating in private and social matters (Hilaal
1998). The famous muʿallaqaat ‘suspended’ poems were said to have been selected as
the best poems composed during pre-Islamic times. ‘They were hung up in the Ka’aba
in Mecca on account of their merit; that this distinction was awarded by the judges at
the fair of ʿukaz near Mecca, where poets competed and the successful compositions
were transcribed in letters of gold’ (Nicholson 1985). None of these poems reached
the Arabs in their written form, because the oral tradition was prevalent. To retrieve the
old poems, Arabs had to rely on narrators such as hammaad, who memorized a great deal of
poetry and could recall it. It is claimed that the Arabs were able to document
most of the pre-Islamic poems thanks to hammaad ‘al-raawiya ‘the narrator’.

Because Arabs considered poetry to be the best form of literature, little attention 
was paid to prose. Nicholson (1985) states that ‘since the art of writing was neither 
understood nor practised by the heathen Arabs (i.e. Arabs in pre-Islamic times) in 
general, it was impossible that prose, as a literary form, should exist among them’. 
Therefore, most of the documentation in later stages concerned poems. 

1. classical Arabic. What do we know about CA? To find evidence in a language, we
need to examine what is written and what is spoken. What can we say about CA based
on the available written and oral data? 

The current use of CA may shed some light on its use as an exclusive language. To 
summarize, here are some features specific to CA. 

1. Arabic is a Semitic language. Logically, therefore, CA is also a Semitic 
language. 

2. Classical Arabic is the only Semitic language that retains all cases. (Some 
cases are retained in Akkadian as well.) That is, it is a fully vocalised language,
where cases are realized via the vowels a, i, u and endings.

3. CA is generally reserved for official use, media reports through television 
and radio, and religious ceremonies. The official news on all Arabic-speaking
radio stations or television channels is presented in written form to the
newsreader, who reads it in CA. In interviews or live reports, either CA alone 
or a mixture of both CA and dialect or the dialect alone is used. 

4. CA is a written language that exists in and through texts. 

5. The people who speak CA are the educated class who are either trained 
or trained themselves to use it (often with mistakes) or religious figures or 
scholars. To repeat, you need to study it to speak it fluently. 

The above indicate that CA was and still is an exclusive language. 

2. the grammar of CA. The grammar of CA was first considered during the time
of the fourth Caliph, Ali bin Abu Taaleb, who agreed that the grammarian Abul
Aswad Al-Du’ali should write it down (Shaami 1997). The reason is given in a story
often mentioned in Arabic grammar sources. Before citing this curious story, let me
explain two grammatical rules to make it easier to understand the text. 

The word maa in classical Arabic has various functions. For example, if it is used
to form a question, then the noun following it should be in the nominative case, indicated
by the vowel /u/; if, however, it is used for exclamation, then the noun following
it should be in the accusative case, indicated by the vowel /a/.

The story runs as follows. Al-Du’ali’s daughter was looking at the sky one day and
she exclaimed maa ajmalul samaa’, using the nominative case indicated by the vowel
/u/ (she meant ‘how beautiful the sky is’). Her father thought she was asking him a
question and said ‘What part of it is beautiful?’ The daughter replied ‘I meant how
beautiful the sky was’. The father replied, ‘In that case you should have said maa ajmalal
samaa’, using the accusative case’. Due to the mistake his daughter made, Al-Du’ali
was alarmed and decided that the grammar of CA should be written down in order
to preserve it. So he immediately went to the Caliph, who commissioned him to work
on this project.

There are, however, no extant written documents on grammar by Al-Du’ali,
although he is credited with the initial steps. Arabic references say that he
invented the dot system to differentiate consonants and that the signs used for vocalization
were later concocted by his students. The story, in my opinion, indicates: (a) CA
was not a natural language and had to be taught, and (b) people were not used to 
vocalizing the ends of their words in speech. 

The first complete work on CA grammar was documented in the 7th century
AD by Sibawayh, a scholar of Persian origin and a student of Al-Du’ali. His motive
was to preserve the language, because people were making mistakes, or lahn ‘solecism’,
when using it. Why were people making mistakes? The claim was that many
nationalities moved to the Arabian peninsula after the introduction of Islam and they
were affecting Arabic in a negative way. This seems dubious, because the movement
of non-Arabs to the Arabian peninsula was not prominent until later stages in Islamic
history (probably at the end of the Umayyad dynasty). In fact, the languages that
were spoken in areas conquered by Arabs were supplanted by Arabic or marginalised, 
thus causing some languages, e.g. Coptic and Syriac, to disappear as living languages 
(Versteegh 1997). 

Many historians and linguists feel that the divergence between the CA and the
vernacular was the principal motivation for the emergence of grammar as an
independent discipline (Versteegh 1997). In addition, Arab grammarians wanted to
unify Arabs via a lingua franca and preserve the Holy Quran, in which the highest
or best form of CA is used. Before Islam, Arabic was the language of its own people in
the Arabian peninsula, but later it became ‘the language of a large empire, in which it 
functioned as the language of religion, culture and administration’ (Versteegh 1997). 
It was logical, then, to have a grammar for the dominant language, so that every 
Moslem, regardless of origin, could understand the Holy Quran, which was a social 
constitution as well as a religious text. Other evidence that CA is linked directly with 
the Holy Quran is the official use I mentioned earlier; media reports find reporters 
and newsreaders using both CA and dialects. This is not the case with religious interviews
or religious lectures. They are conducted in pure CA.

3. classical Arabic and the dialect of Quraysh. When philologists tried to
codify the Holy Quran, one of the problems they faced was the varied readings of the
text. The so-called professional readers of the Quran during the time of the Caliphs
tried to promote their own readings, which were, I suspect, based on their own dialects.
As a result, ‘Ibn Majaahid confirmed seven authentic readings of the Holy Book
(Hilaal 1998). ‘Ibn ʿabbaas followed suit, but said there were ten readings. Scholars,
however, suggest that more than seven readings of the Quran are acceptable, due to 
the number of tribes and dialects at that time. So despite efforts to unify the language, 
the vernaculars interfered, rendering other readings of the Quran valid. 

Another point worth mentioning is that Arab historians say that the Holy Quran 
was largely based on the dialect of Quraysh, a tribe that lived in the Hijaaz region, 
because the people of this area were the most fluent speakers of Arabic. Yet we find 
many causes for doubt here, arising from what we know about the geography and 
language of Quraysh. I cite some of them now. 

1. The Prophet Mohammad once said ‘I speak Arabic fluently, although (bayda)
I am from Quraysh’. The word bayda ‘although’ was interpreted as ‘because’ by
some historians, a most exceptional rendering of the word in Arabic.

2. The geographical map of the Hijaaz region differs from one book to the next. 
Some books say that Hijaaz is the area from the border of Iraq to the western 
coast of Saudi Arabia, others say it extends from central Saudi Arabia to the 
west coast of the peninsula. Thus it is unclear where the Quraysh lived. 

3. Sometimes linguists talked about the difference between the language of 
Quraysh and that of Hijaaz. This is strange because Quraysh was in fact part 
of Hijaaz and not a separate region. Again this question concerns inferences 
drawn from the geography of the region and not the language. 

4. Another problem regards any claim of Qurayshis as more eloquent speakers 
of Arabic than members of neighboring tribes. Normally, the people who 
live close to each other speak almost the same form of language (unless we 
are talking about urban vs. adjacent rural areas). In fact, the literature at 
hand admits that many tribes (e.g. Tamiim, Huthail, Bani ‘assad, Rabiiʿa) had
eloquent Arabic speakers. 

5. Some examples of grammatical rules pertaining to case endings contradict 
the rules cited in books. For example, the people of Hijaaz were said to use 
the nominative where the people of Tamiim used the accusative when negating
with the word maa. Upon examination of some poetry of the people of
these regions we find that the usage is exactly the opposite. If the rule given 
were accurate, the poetic usage would be unnatural. A person would apply 
the rules of his own language or dialect instead of borrowing a rule from 
another region (Al-Sammirraa’i 1997). This casts doubt on the accuracy of 
the documented forms. Here are examples of the usage of maa in the dialects
of Tamiim and Hijaaz.

Tamiim: maa ‘al-darsu sahlun (acc)
Hijaaz: maa ‘al-darsu sahlan (nom)
        not the-lesson easy
        ‘The lesson is not easy’

6. In examples cited by Arab grammarians concerning the pronunciation of 
the glottal stop in the accusative case, the people of Tamiim retain the sound 
while the people of Hijaaz change it. In CA the glottal stop is pronounced. 
Here is an example cited by Al-Sammirraa’i (1994:41): 

Tamiim: ‘amlaytuhu ‘imlaa’an (as in the classical form)
Hijaaz: ‘amlaythu ‘imlaalan (changing the glottal stop into /l/)

In view of the above, we can conclude that the Quran was based on the dialect of 
Qurayshis not because of their mastery of Arabic, but for the following reasons: 

Language    dual masc.   dual fem.    plural masc.   plural fem.
CA          katabaa      katabataa    katabuu        katabna
Egyptian    katabuu      katabuu      katabuu        katabuu
Bahraini    kitbau       kitbau       kitbau         kitbau

Table 1. Comparison of CA and dialectal forms of kataba ‘wrote’ for masculine and
feminine dual and plural.


Language    plural nominative   plural accusative   plural genitive
CA          kaatibuun           kaatibiin           kaatibiin
Bahraini    katbiin             katbiin             katbiin
Lebanese    kaatbiin            kaatbiin            kaatbiin

Table 2. Comparison of CA and dialectal forms of kaatibun ‘writer, writing (gerund)’
for nominative, accusative and genitive plural.

1. The Prophet Muhammad was from Quraysh and since the Holy Quran was 
revealed to him, it would be in his dialect. 

2. Quraysh was the hub of commercial and literary activity. It gathered people 
from all tribes and regions. As a result, its language became a mixture of 
various dialects that could be understood by people from different regions. 
This implies that it was not a pure form of Arabic, but a dialect mixture. 

4. CA vs. spoken Arabic. Many argue that Arabs in the past used to vocalise their
words fully, as in CA. But when we examine the dialects, the following are noted: 

1. In spoken Arabic and the dialects, there is no vocalization and no case endings,
an essential feature of the classical form. For example ‘il-walad fil beit
‘the boy is at home’ as opposed to CA ‘alwaladu fil baiti (cf. Table 1).

2. Dual forms and sound feminine plural verb inflections used in CA are
absent from the dialects. E.g. ‘il-bannat gaaloo ‘the girls said’ as opposed to
‘al-banaatu qulna (cf. Table 2).

3. Dual pronouns and feminine plural pronouns are absent in the dialects.
Instead the masculine plural form is used for both the dual and the sound feminine.
In the classical form humaa refers to ‘they’ (dual), hunna to ‘they’
(feminine plural) and hum to ‘they’ (masculine plural). In the dialects, however,
only one form (usually the masculine plural) is used for all three forms of
the pronoun.

4. The masculine sound plural in active participles is used with one case ending,
not two as in CA. The classical form is kaatibuun ‘writers’ (nominative)
or kaatibiin (accusative and genitive); the dialects have only kaatbiin.
The same applies to dual forms: in CA kitaabaan ‘two books’ or kitaabayni,
in the dialects, ktaabein.

Language    Type   Sentence: ‘My mother wrote the letter’
CA          VSO    katabat (V)   ummii (S)     ar-risaala (O)
                   wrote         mother+my     the letter
dialect 1   SVO    ummii (S)     kitbat (V)    ir-risaala (O)
                   mother+my     wrote         the letter
dialect 2   SVO    immii (S)     katabit (V)   ir-risaale (O)
                   mother+my     wrote         the letter

Table 3. Word order variations between CA and dialects.

5. Compound numbers (11 to 19) are pronounced as (1 to 9) + 10. Thus 13 in Arabic
would be 3 + 10, thalaatha ʿashar. They have one form in dialect, e.g. thalataʿash
‘thirteen’ in the Gulf dialect, regardless of the gender of the noun counted. In
CA, however, they are written or pronounced in two ways, depending on the
following noun: the first part of the compound number (three) has gender
opposite to that of the noun and the second part (ten) has the same gender. To
clarify this, here is an example using the number 13 in CA:

13 = three + ten     thalaatha ʿashar
13 books (masc)      thalaathata ʿashara kitaaban (ta is the feminine suffix)
13 schools (fem)     thalaatha ʿashrata madrasatan
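
The gender rule for 13 can be stated as a small lookup. The sketch below is illustrative only and is written in Python; it covers just the number 13 and uses only the forms cited above, with ʿ standing for the letter written as a raised c in the transliteration. The function and table names are hypothetical, and extending the tables to the other compound numbers would need forms that this paper does not give.

```python
# Forms of 'three' and 'ten' in CA, keyed by the grammatical gender of the form.
# The form of 'three' carrying the suffix -ta is the feminine one.
THREE = {"masc": "thalaatha", "fem": "thalaathata"}
TEN = {"masc": "ʿashara", "fem": "ʿashrata"}


def opposite(gender: str) -> str:
    """Return the other grammatical gender."""
    return "fem" if gender == "masc" else "masc"


def ca_thirteen(noun_gender: str) -> str:
    """CA: 'three' takes the gender opposite to the noun, 'ten' the same gender."""
    return f"{THREE[opposite(noun_gender)]} {TEN[noun_gender]}"


def dialect_thirteen(noun_gender: str) -> str:
    """Gulf dialect: a single form, whatever the gender of the counted noun."""
    return "thalataʿash"


books = ca_thirteen("masc") + " kitaaban"      # 13 books (masculine noun)
schools = ca_thirteen("fem") + " madrasatan"   # 13 schools (feminine noun)
```

In this sketch the CA function needs the noun's gender twice, once inverted and once directly, while the dialect function ignores it entirely, which is the contrast the example is meant to bring out.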

In addition to the above, the variety between dialects is based on lexical items and 
phonological differences, but the grammar is almost uniform. In the three major 
dialect groups, Gulf, Levantine, and North African Arabic, we find different words 
to reflect the same meaning: ‘he wants’, for example, would be yabbi (Bahrain), baddu 
(Lebanon), or ʿaawiz (Egypt). This parallels lexical differences in English: American
English has truck and apartment, British English has lorry and flat. 

The phonological differences between dialects and CA lie in vowel lengths and 
specific consonants. The consonants include the emphatic phonemes (dh, zh, th, q, j). 

The grammatical structures are parallel in almost all forms of spoken Arabic. 
However, SVO is favored in verbal sentences when the subject is third person, while 
in other cases, we find VSO (cf. Table 3). Linguists classify CA as a VSO language. 
This is an overgeneralization, since Arabic also has verbless sentences with the subject 
before the predicate and it also allows OVS. This again shows that dialects differ in 
their grammatical preferences, if not their structures, when compared to CA. (For a 
detailed study of some dialects, see Brustad 2000). 

Why then do we find only these features common to all dialects? Why can’t we 
find some grammatical elements that are specific to (at least) certain dialects and why 
(if Arabs used to vocalise the ends of their words) do we not find a tribe or people 
who still do? It is true that language changes over time, yet CA has not changed much
thanks to the preserving influence of the Holy Quran. But if this is the case, then one 
would expect to find a region where this formal type of Arabic was spoken. 

The problem encountered when studying Arabic linguistically is due to the fact that 
little attention was paid to documenting the dialects in the past. Whenever examples 
were cited, they served to prove a grammatical point related to CA. The late linguist
‘Ibrahim ‘Al-Sammirraa’i disputed the use of such examples, saying that grammarians
concocted them to prove or support a rule in CA grammar (‘Al-Sammirraa’i 1994).
Hilaal (1998) says that when the grammar of Arabic was written, the dialects were
neglected and whenever they were mentioned, they were considered part of CA. The
grammarians would mould the dialects to force them into the framework of CA, and 
if the examples did not fit, they would be considered ugly or deviant (Hilaal 1998). 

5. conclusions. The authenticity of documented material on CA is doubtful, making it 
difficult to find conclusive proof about the existence of this form of Arabic as a spoken 
language in the past and, if the dialects diverged from CA, then we lack evidence pertaining
to the linguistic drift. In addition, the dearth of documentation on Spoken
Arabic among various tribes and city dwellers poses another problem for researchers. 

Based on what is available on both forms of Arabic, I conclude that CA was probably
not a spoken language and therefore the dialects may not have been derived from
it. The opposite makes more sense. That is, CA was based on the dialects to make
communication easier. Also, since we find no traces of case endings in the dialects, I
conclude that the dialect transcriptions need not be contemporaneous with CA, since
the latter has not changed. Finally, though CA first came about to facilitate communication
among different tribes, it became a unifying force in Islam, intended for all
Moslems, be they Arabs or non-Arabs. 

The study of the grammar of CA has been exhausted. Most books repeat whatever 
Sibawayh and his students mentioned with little modification, making no significant 
contribution to the field. Dialects were ignored for such a long time and in the case of 
Arabic were sacrificed for the sake of the Classical form. This makes it difficult to find 
definite answers to some questions. Fortunately, dialectology has developed considerably
in the past one hundred years or so, and this should encourage more studies that
can be documented on current Arabic dialects, unlike old dialects that have almost all 
been lost because they were not studied in their entirety. 


PRONOMINAL EVIDENCE IN SLAVIC AND THE MEANING OF CASES


Barbara Bacz 
Universite Laval 

Pronouns, in contrast to the other part-of-speech categories traditionally distinguished
in the IE languages (nouns, verbs, adjectives, adverbs and even prepositions)
are characterized by the absence of lexical content¹. Therefore, they are the
best representatives of purely grammatical meaning, and their different forms can be 
regarded as legitimate indicators of possible differences in that meaning. 

In a footnote to his 1936 classic on case meaning, Beitrag zur allgemeinen Kasuslehre,
Jakobson suggests that the various morphemic oppositions observed in the
forms of pronouns denote semantic differences. He mentions three such differences
in Slavic signaled by the formal, morphemic oppositions in Slavic pronouns: a) the
difference ‘animate’ versus ‘inanimate’ manifested through the opposition of k and č in
the declension paradigm of the Russian pronouns kto-N.Sg ‘who’ and čto-N.Sg ‘what’;
b) the difference in the grammatical category of person indicated through the opposition
of the Russian ja ‘I’ versus ty ‘you (sg)’, and on ‘he’, etc., and c), most significant
for the discussion to follow, the difference in the grammatical category of case with
respect to the case’s ‘relatedness’ to a preposition, manifested in the j-/n’- morphemic
opposition in the third person Slavic pronouns (Jakobson 1936/1995:535, footnote 17,
emphasis added):

(1) ‘The pronouns, which, in contrast to the other parts of speech, express not
real but formal meaning in their root morpheme, often denote by their
root morpheme such semantic differences as are otherwise conveyed as
morphological or syntactic oppositions: on the one hand, the categories of
animacy and inanimacy (opposition of the root morphemes k and č: kto
[N] [who] and čto [N] [what], kogo [G] [whose] and čego [G] [of what],
etc.), of person (ja [I], ty [you (sg.)], on [he]) and, on the other hand,
in highly unusual fashion the opposition of relatedness versus unrelatedness
to a prepositional construction, which is consistently expressed in
third person pronouns by the distinction n’ versus j: nego-jego, nemu-jemu
[he], nee-jee [she], and so forth.’

1. PREPOSITIONLESS AND PREPOSITIONAL FORMS OF PRONOUNS. In case languages such
as Polish or Russian, the grammatical category of case manifests itself in discourse
(i.e. in actual usage) under two forms: a form without a preposition, in cognitivist case
semantics referred to as prepositionless case (also known as ‘morphological’ or ‘synthetic’
case), and a form representing a combination of a preposition and a case-marked category,
known as prepositional case (sometimes also referred to as ‘analytical’ case). In the
casual paradigm of both Polish and Russian, third person pronouns (on, ona, ono, oni,
one in Polish) have two contrasting morphological forms, the j-form and the n’-form,
which are in complementary distribution: the j-form is found in prepositionless uses
of a given case (such as the prepositionless adnominal genitive in jego-G. ojciec ‘his
father’), and the n’-form occurs in preposition+case combinations, such as the prepositional
genitive with do ‘to’ in Idę do niego-G. ‘I am going to him’. The difference between
the prepositionless j-forms and the prepositional n’-forms is illustrated by the uses
of the Polish third person masculine pronoun on ‘he’ quoted in (2).

(2)  Prepositionless j-forms:    Prepositional n’-forms:

     G. jego ‘his’               do/od/z ... + niego ‘to/from/of ... him’
     D. jemu ‘him’               ku/wbrew ... + niemu ‘to/against ... him’
     A. jego ‘him’               przez/w/na ... + niego ‘by/in/on ... him’

The complete declensional paradigm for the third person pronouns in Polish is reproduced
in Table 1.

As shown by the examples in (2) and the pronominal paradigm in Table 1, the distribution
of the j- and n’- forms in the third person pronouns in Polish is very systematic,
the two pronominal forms corresponding almost perfectly to the prepositionless
and the prepositional uses of the Polish cases.²
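
The complementary distribution just described can be restated as a small lookup. The following Python sketch is only an illustration: it covers the masculine singular forms listed in (2) and Table 1, leaves out the clitic variants go and mu, and the function name third_person_masc is a hypothetical label rather than anything proposed in the literature cited here. The treatment of the instrumental (n’-form even without a preposition) follows the exception discussed in the notes to this paper.

```python
# Illustrative sketch only: forms are transcribed from (2) and Table 1;
# the function name and error messages are hypothetical.
J_FORMS = {"G": "jego", "D": "jemu", "A": "jego"}
N_FORMS = {"G": "niego", "D": "niemu", "A": "niego", "I": "nim", "L": "nim"}


def third_person_masc(case: str, after_preposition: bool) -> str:
    """Return the form of Polish 'on' for a case label (N, G, D, A, I, L)."""
    if case == "N":
        if after_preposition:
            raise ValueError("the nominative does not combine with a preposition")
        return "on"
    if case == "L" and not after_preposition:
        raise ValueError("the locative is always prepositional")
    # The instrumental has no j-form, so it falls through to the n'-form
    # even in prepositionless uses.
    if after_preposition or case not in J_FORMS:
        return N_FORMS[case]
    return J_FORMS[case]


print(third_person_masc("G", after_preposition=False))  # adnominal genitive
print(third_person_masc("G", after_preposition=True))   # after do 'to'
```

The only point of the sketch is that the choice of form is predictable from two pieces of information, the case and the presence of a preposition, which is what makes the opposition usable as evidence.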

2. SEMANTIC DIFFERENCE BETWEEN PREPOSITIONLESS AND PREPOSITIONAL CASES. 

On the assumption that a difference in form indicates a difference in meaning (an
assumption which underlies the research of both Jakobson and the contemporary
cognitivist semanticists such as Langacker, Rudzka-Ostyn, and Janda), the systematic
j-/n’- opposition in Slavic pronouns suggests that there is a semantic difference
between the prepositionless and the prepositional forms of a given case. In terms of
grammar, the difference between the two forms of case can be attributed to the formal
(structural) opposition between two categories belonging to two different grammatical
levels: the morphological category of a word, represented by prepositionless case,
and the syntactic category of a phrase, represented by prepositional case. In terms
of the semantics [i.e. the underlying mental representation] of the case-marked elements,
the j-/n’- opposition in the pronominal paradigm indicates a distinction in
meaning between a case-marked bare noun and the same case-marked noun used
in a prepositional phrase. That means, to give a practical example, that the speaker’s
conceptualization (mental representation) of the accusative-marked lexical item
tydzień ‘a week’ in (3)a is not identical to the conceptualization of the same,
accusative-marked noun in (3)b:

(3) a. Pracował tam tydzień-Acc. (prepositionless accusative)

He worked there a week.

b. Pracował tam przez tydzień-Acc. (prepositional accusative)

He worked there for a week.

Singular
        Masculine          Neuter             Feminine
N.      on                 ono                ona
G.      jego, go, niego    jego, go, niego    jej, niej
D.      jemu, mu, niemu    jemu, mu, niemu    jej, niej
A.      jego, go, niego    je, nie            ją, nią
I.      nim                nim                nią
L.      (o) nim            (o) nim            (o) niej

Plural
        Masculine          Either             Non-masculine
N.      oni                                   one
G.                         ich, nich
D.                         im, nim
A.      ich, nich                             je, nie
I.                         nimi
L.                         (o) nich

Table 1. Declensional paradigm of the third person pronouns in Polish (based on
Doroszewski & Wieczorkiewicz 1972:91-92).

The difference between the two conceptualizations of the same case-marked nominal,
which depends on whether it is used with or without a preposition,
is very difficult to specify because the relationship between the event (the subject’s
working) and the temporal setting of the event (a week) indicated by the case-marker
remains the same. Most grammarians agree that the presence of a preposition in the
preposition + case combination makes more specific the relationship expressed by
the case-marker. The j-/n’- pronominal contrast additionally suggests that the presence
of a preposition also affects our mental representation (construal or conceptualization)
of the case-marked lexical item, and even if the semantic difference between
a case-marked nominal and the same case-marked nominal combined with a preposition
is very small, its existence has to be acknowledged.³

Case semanticists have tried to define the semantic difference between prepositionless
and prepositional cases. Jakobson (1936/1995:339) stated it as follows: ‘In a
language which combines a system of prepositional constructions with an independent
system of case, the meanings of the two systems are different in the sense that
when prepositions are used, the relation itself is focused upon, while in constructions
without prepositions the relation becomes a kind of property of the object denoted’.
Langacker (1992) attempted to pinpoint the difference in terms of the Cognitive Linguistics
framework by providing two image-schema models of the instrumental case.
In my opinion, his (1992:301) explanation of the difference between the prepositionless
(which he calls ‘true’) instrumental indicated by the instrumental case-marker
in typical case languages and the prepositional phrase with what he calls the ‘instrumental
preposition’ in typical non-case languages, such as English, is essentially the
same as Jakobson’s.

In Langacker’s image-schema of the prepositionless instrumental, the prepositionless
form profiles the thing denoted by the case-marked nominal. ‘The true
case-marker profiles a schematically characterized thing and incorporates some specification
of its role in the process’ (Langacker 1992:301). Conversely, in the prepositional
construction, the relationship between the event and the intermediary
participant in the event, the instrument, is profiled by the instrumental preposition
(ibid.). In other words, in the prepositionless use, the property of being an instrument
is ascribed to the case-marked nominal, which then, at a higher level of organization,
enters into a relationship with the process evoked by the clause, whereas in the prepositional
use, the relational property of being affected via an instrument is part of the
process evoked by the clause.⁴

3. HISTORY OF THE N’ PRONOUNS. The j-/n’- opposition in Slavic pronouns is, to my
knowledge, the only piece of evidence in linguistic form for postulating a semantic
difference between prepositionless and prepositional cases. In view of the fact that the
difference in meaning between the two forms of case is not readily apparent (and can
be conveyed to a non-linguist merely as a difference in focus), the question of the reliability
of the pronominal j- versus n’- evidence can be raised. The issue of whether the
j-/n’- opposition in Slavic constitutes satisfactory linguistic evidence becomes even
more of a problem when the history of the j-/n’- opposition and the origin of the n’-
pronouns is considered.

According to the historical grammars of Polish (e.g. Kuraszkiewicz 1972:130-31), 
the pronominal third person n- forms replaced the original suppletive ;'-forms in the 
declensional paradigm of the pronouns on, ona, ono ‘he, she, it’ 5 when -«, the final 
consonant of the prototypical Slavic prepositions *vbn (modern tv) ‘in’ and *sbn 
(modern z) ‘with’ shifted and mechanically attached itself to the locative and the 
instrumental j-forms of the following pronouns, respectively. The shift is illustrated 
by the examples in (4) taken from Doroszewski and Wieczorkiewicz (1972:92). 

(4) Forms before the shift Forms after the shift 

*vtn-jemb- loc. ‘in him’ w nim- LOC.’in him’ 

*sbtt-jimb- inst. ‘with him’ z nim- inst. ‘with him’ 

The initial n of the prepositional pronominal forms in the locative and the instrumental
after the prepositions w ‘in’ and s/z ‘with’ has, with time, generalized to the
other prepositions used with these cases (such as przy nim-L ‘next to him’, po nim-L
‘after him’, etc.) and to the other prepositional cases: genitive, dative and accusative
(do niego-G ‘to him’, ku niemu-D ‘toward him’, przez niego-A ‘because of him’).

A contemporary example confirming the historical j- to n’- shift in the morphemic
structure of third person pronouns in Slavic can be found in the attested occurrence
of both the j- and the n’- forms in the prepositional phrase dzięki niemu / dzięki jemu-D
‘thanks to him’ in modern Polish (Kuraszkiewicz 1972:131).⁶

The original preposition vtn ‘in’ attached to the accusative, as in vbn -jb-Acc. ‘in 
him’, has produced a contracted prepositional form of the masculine accusative of the 
pronoun on ‘he’, the form wen (as in Wpatrywali si% wen [w niego] z niepokojem They 
were staring at him with apprehension - see Dunaj 1996:629 for more examples), and 
later, analogical contracted forms of prepositional pronouns in the accusative and the 
genitive derived from combinations with other prepositions, for example: don (do 
niego) ‘to him’, zen (z niego) ‘from him’, nan (na niego) ‘on him’, przezen (przez niego) 
‘because of him’, etc. In modern Polish contracted n- pronouns are considered a mark 
of very formal, literary style. A few examples of these uses taken from the 16th century 
Polish writer Mikolaj Rej (Kuraszkiewicz 1972:131) and from modern literary Polish 
(Dunaj 1996:629) are given in (s)a and b, and (s)c and d, respectively: 

(5) a. Zgrzytali nań (na niego-A) zębami.

They gnashed their teeth at him (= because of him)

b. Trudno się oń (o niego-A) było pokusić.

It was difficult to be tempted about him (= to have him).

(Kuraszkiewicz 1972:131)

c. Zwrócili się doń (do niego-G) z prośbą.

They turned to him with a request.

d. Gotowi byli zań (za niego-G) umrzeć.

They were ready to die for him.

(Dunaj 1996:629)

4. RELIABILITY OF THE J-/N’- EVIDENCE. The shift of the nasal n from the final position
in a preposition to the initial position in the following pronoun in Slavic can hardly
be considered to have been motivated semantically. A parallel to the Slavic example
under discussion is provided by the English words newt (a kind of lizard), which in
fact stands for an ewt (from the original AS form efeta ‘a lizard’), or a nickname, which
is an alternative form of the original an eke-name (with the two co-existing forms in
ME: a nekename = an ekename). A converse shift of the consonant n in English from
the initial word position onto the preceding indefinite article, observed in the history
of the words apron (formerly napron, from OF naperon < nape), adder (originally
nadder, from OE naddere) or umpire ‘a non-pair’ (a more recent version of numpire,
from ME nonpere, from OF nomper, nompair), has been explained by some historical
grammarians bluntly as the result of ‘a speaker’s mistake’ (see Skeat 1980:5).

The n’-forms in Slavic pronouns have come into being as a result of a mechanical
shift of a consonant in a previous stage of a language. Yet, the j-/n’- opposition in third
person pronouns created through that mechanical operation has come to indicate a
semantic difference between the two pronominal forms, and by extension, between
the two (prepositionless and prepositional) forms of case.

In my opinion, the way in which a particular formal distinction arose diachronically
is irrelevant to its synchronic semantic status. Grammatical systems of a language
change just as do individual forms in that language. Thus, the history of a linguistic
sign should have no necessary bearing on its significance in present-day systems.

5. HISTORY OF THE TWO POSSESSIVE PRONOUN FORMS IN ENGLISH. The history of the
j-/n’- opposition in Slavic is similar to the history of the possessive pronouns in English.
In the contemporary pronominal system, English possessives are used in discourse
under two morphological forms: the short, ‘adjectival’ form, with the specific
paradigmatic realizations: my, your, his, her, its, our, your, their, and the long form of
the ‘possessive pronouns’: mine, yours, his, hers, its, ours, yours, theirs. The two forms
remain in complementary distribution, the adjectival form being restricted to the
attributive position in a noun phrase (as in my book), the possessive pronoun occurring
in the predicative position only (as in This book is mine). The ‘adjectival’ versus
‘truly pronominal’ formal opposition in the system of the English possessives indicates
clearly (to my mind, at least) that the attributive and the predicative categories
(specifically, the category of attributive and predicative adjectives) are not semantically
identical, as an early Chomskyan Adjective Transformation would have it. In
other words, possessive pronoun evidence from modern English, manifested through
the short versus long form morphological opposition, can be considered to prove the
existence of a semantic difference between the attributive and the predicative uses of
a lexical category.⁷

In the earlier stages of English, however, the distribution of the two forms of the
possessive pronouns was quite different, and the short (my) / long (mine) formal opposition
did not indicate a semantic difference between the attributive and the predicative
systems. The original genitive-case long form of the pronoun was used in the attributive
as well as in the predicative position in a sentence, and if the two forms were found in
the prenominal position, the short form (which had lost the final -n) tended to occur
before nouns beginning with a consonant while the full, long form appeared before
nouns starting with a vowel (Pyles & Algeo 1993), as illustrated in (6).

(6) Possessive pronoun distribution in English 

My egg/book.: This egg/book is mine. (Modern English) 

Mi book / min eg/ey. (Middle English) 

6. CONCLUSION. The my/mine example from English shows that the origin of a form
(such as a mechanical loss of the final -n from the ME min) has nothing to do with
the form’s grammatical distribution, and consequently, with the form’s significance
in the modern system. Although created through a mechanical shift, the short possessive
forms in English, just as the n’- forms in Slavic, have eventually become indicators
of meaningful oppositions between different grammatical categories. How the
difference in meaning between these categories should be defined is a matter of the
linguistic theory at our disposal. A systematic opposition of forms, however, always
indicates a difference in meaning, for two different forms in the same category never
co-exist for long in the same semantic capacity.


1 I would like to thank the two LACUS Forum reviewers of this paper for their careful and
inspiring comments.

2 The nominative, as the prototypical form of the casual declension, never combines with
a preposition; thus, it has no corresponding n’- form in the pronominal paradigm. The
locative, which is always prepositional, predictably has no j-form in the pronominal
declension. The Polish Instrumental, however, which can have both the prepositionless
and the prepositional realizations (as in: Szedł żołnierz lasem-I. ‘Was walking a soldier
through the forest’, with the prepositionless instrumental of place, and Cyganka mieszka
za lasem-I. ‘The Gypsy woman lives beyond/on the other side of the forest’, with the
preposition za+Instrumental combination (prepositional instrumental of place)), is rendered
by the n’-form of the pronoun only. Kuraszkiewicz (1972:131) explains this apparent
inconsistency in the otherwise strikingly regular pattern of correspondence as an
overgeneralization of the n’-form which has spread onto the prepositionless uses of the
case. His illustrative examples are: Idę z nim, z nią, z nimi ‘I am going with him, with her,
with them’ (prepositional instrumental) versus Gardzę nim, nią, nimi ‘I despise him,
her, them’ (prepositionless instrumental).

3 It goes without saying that the specific meaning imported by a preposition has to be compatible
with the meaning of the case the preposition combines with. In example (3)b the
Polish preposition przez ‘through, across’ combines with no other case but the accusative,
so the two senses are compatible almost by definition. However, when a preposition combines
with more than one case (as does e.g. the Polish preposition w ‘in’, which ‘governs’
[(Janda 2000) uses the term ‘motivates’] two cases: the accusative w tydzień ‘in a week’ and
the locative w tygodniu ‘during the week’), semantic compatibility of the two elements is
much harder to establish.

4 For a discussion and an interpretation of Langacker’s 1992 graphic schemas of Instrumental
Case Marker versus Instrumental Preposition, see Bacz 2000:10-12. It should be noted that
Langacker’s explanation of the difference between the ‘true’ (prepositionless) instrumental
and the instrumental preposition is necessarily cross-linguistic since it is based on examples
taken from typologically different languages: the ‘true instrumental’ represents a morphological
case in a typical case language while the ‘instrumental preposition’ is illustrated by the
preposition with in English. The semantic import of preposition+case combinations, typical
of Slavic, has to be taken into consideration when an explanation of a difference between
morphological (prepositionless) and prepositional uses of a case is attempted.

5 Originally, the forms on, ona, ono denoted demonstrative pronouns; cf. the archaic Polish
uses: onego czasu-G. ‘(at) that time’ or naonczas-Adverbial ‘in/at that moment’, the original
third person pronouns being: ji, ja, je (see Doroszewski & Wieczorkiewicz 1972:91).

6 According to my native speaker intuition, the expression with the j- form (dzięki jemu)
sounds less natural than the expression with the n’- form (dzięki niemu), a fact which suggests
that dzięki ‘thanks to’ has clearly become grammaticalized as a preposition here.

7 There are other arguments proving that attributive adjectives are semantically different
from predicative adjectives, e.g., in Russian and in Polish the so-called ‘short adjectives’
occur only in the predicative position; cf. Zdrowy i wesoły chłopiec ‘a healthy and happy
boy’ versus Chłopiec jest zdrów/zdrowy i wesół/wesoły ‘The boy is healthy and happy’. In
my opinion, the possessive pronoun evidence found in modern English (yet not in Slavic
or in Latin; cf. Moja książka ‘my book’ and Ta książka jest moja ‘This book is mine’ in
Polish) is just one more indicator of a semantic difference between attributive and predicative
categories.

TOWARD A BETTER UNDERSTANDING OF CLITIC SYSTEMS


David C. Bennett 
SOAS, London 


This paper¹ aims to contribute to our understanding of clitic systems, from which
point of view the Slavic languages provide a useful source of data, since they exhibit
an especially rich variety of systems.² The paper addresses the theme of the conference
(the nature of linguistic evidence) in the following three ways. First, since clitics
represent an intermediate stage between independent words and affixes, they cry out
to be treated diachronically. Secondly, corpus-based data provide important insights
into a range of issues. Thirdly, adequate attention needs to be paid to both formal and
functional considerations. Sections 1.1, 1.2 and 1.3 of the paper expand briefly on each
of these views in turn. Sections 2 and 3 are devoted to on-going changes in the Polish
clitic system. Section 4 then briefly discusses the formalization of the diachronic
development in question; and the paper concludes with a summary of key points in
section 5.

1. CLITICS AS LINGUISTIC EVIDENCE. 

1.1. DIACHRONIC CONSIDERATIONS. Clitics are unstressed, and in many cases shorter,
counterparts of independent words. Thus the Serbo-Croatian (S-Cr) clitics ga ‘him’
and mu ‘to him’ correspond to the full forms njega and njemu, respectively. An overview
of diachronic developments involving clitics is shown in (1):

(1)  A              B               C                      D               E

     Independent  > P2 clitics    > P2 clitics           > Verb clitics  > Verb affixes
     words          (word-based)    (constituent-based)

Second position (P2) clitics sometimes occupy the position after the first accented
word of a clause (stage B in (1)), even when it is the first word of a complex constituent,
as in S-Cr example (2)a, and sometimes occupy the position after the first syntactic
constituent (stage C), as in S-Cr example (2)b.³ (The clitics in these and subsequent
examples are italicized.)

(2) a. moj  će          vam     sluga    dati  rječnik      [Cro]
       my   aux.future  to-you  servant  give  dictionary

    b. moj  sluga    će          vam     dati  rečnik       [Ser]
       my   servant  aux.future  to-you  give  dictionary

       ‘my servant will give you the dictionary’

Moreover, there is ample evidence that word-based P2 represents an older position for
clitics than constituent-based P2. For S-Cr, Browne writes (1975:114):

in general it is more old-fashioned and literary to break up a phrase by putting
the enclitics after the first word... In everyday and conversational style, enclitics
are more likely to be put after the whole phrase.

For Slovenian (Sln), Stone reports that the oldest texts reveal examples of clitics
interrupting complex syntactic constituents (1996:217). However, a Sln translation of
(2) that parallels (2)a (see (3)a) is ungrammatical, the preferred version being (3)b,
which is parallel to (2)b:


(3) a. *moj  vam     bo          služabnik  izročil    slovar      [Sln]
        my   to-you  aux.future  servant    hand-over  dictionary

    b.  moj  služabnik  vam     bo          izročil    slovar      [Sln]
        my   servant    to-you  aux.future  hand-over  dictionary

Outside of Slavic, Latin (Lat) examples such as (4), in which the subject noun phrase
populus Romanus ‘the Roman people’ is interrupted by the clitic reflexive pronoun se,
represent an older Lat word order than examples in which clitics follow a complex
initial constituent (Benacchio & Renzi 1987:5):

(4)  populus  se      Romanus  erexit      [Lat]
     people   itself  Roman    raised
     ‘the Roman people rose up’

With regard to the hypothesized shift from the constituent-based P2 stage (C in (1)) to
the stage at which the clitics are positioned adjacent to the verb (D in (1)), there is,
again, ample supporting evidence. For instance, ‘Older Bulgarian was clitic second’
(Franks & King 2000:318) but Bulgarian now positions its clitics adjacent to the verb.
Similarly, Latin had P2 clitics, of both stage B and stage C, but modern Romance languages
such as French and Spanish are at the verb-clitic stage, stage D. And, outside
of Indo-European, Steele uses Uto-Aztecan data to support the claim that verb-clitic
systems derive from P2-clitic systems (1977:539).

Finally, the change from stage D to stage E (i.e., the development of verb affixes
from verb clitics) is also well attested. This has affected the inflections of the French
future tense (Decaux 1955:224) and what is now the reflexive suffix -ся (-sja) of Russian
(Jakobson 1971:19). Haas (1977) discusses comparable developments within the
Muskogean languages.

The Sln clitics with which we are concerned occur in a fixed sequence in a clitic
cluster, and in the vast majority of cases the clitic cluster occurs at P2 syntactically
defined, i.e., after the first syntactic constituent of a clause. In other words, Sln seems
to be at stage C of (1). Ser and Cro represent transitional stages between B and C, but
are otherwise similar to Sln in that they seem to belong at particular points along
the scale shown in (1). However, it is by no means necessarily the case that languages
occupy just one position along this scale. Russian (Rus) is a case in point. On the basis
of the fact that its reflexive marker is a verb suffix, Rus seems to have reached stage E.
However, Rus also has an interrogative marker ли (li) which is a P2 clitic, and specifically
a stage B P2 clitic which follows the first accented word of its clause, as is seen
from example (5)a (taken from Franks & King 2000:189) and the fact that (5)b, where
ли (li) follows the clause-initial complex constituent, is ungrammatical:⁴

(5) a. Она не знает, в этой ли аудитории будет лекция

Ona ne znaet, v etoj li auditorii budet lekcija [Rus]

she neg know in this Q auditorium will-be lecture

‘She does not know whether the lecture will be in this auditorium’

b. *Она не знает, в этой аудитории ли будет лекция

Ona ne znaet, v etoj auditorii li budet lekcija [Rus]

she neg know in this auditorium Q will-be lecture

As for auxiliary-verb or pronominal clitics, these have been lost in Russian; and as a 
result Russian does not have a clitic cluster (Franks & King 2000:188). 

The most dramatic shift along the scale shown in (1) is that between P2 clitics and 
verb-clitics. It is this change to which we shall devote most attention below on the 
basis of Polish data. (We shall see, though, that Polish also illustrates the change from 
verb-clitics to verb-affixes.) 

1.2. CORPUS-BASED DATA. It is well-known that the Slavic languages have a rather free
word order, the function of which is not so much to signal grammatical relations such
as ‘subject’ and ‘object’ as to signal what information is ‘given’ and what information is
‘new’ in a particular discourse context. This point can be illustrated by reference to the
Serbian, Croatian, Slovenian and Polish (Pol) translations of a clause from Orwell’s
Nineteen Eighty-Four; see (6).⁵ None of the four Slavic translations reproduces the
syntactic structure of the English original, with its passive verb and agentive prepositional
phrase, but each preserves the discourse structure of the original to the extent
that the equivalent of the noun fear (which in each translation is the subject of its
clause) is placed in the unmarked focus position at the end of the clause. The four
Slavic versions are also in agreement with each other in placing the adverbial phrase
‘at every step’, the verb and its subject in that order: constituents 1, 2 and 3 in the
Cro and Sln versions, constituents 2, 3 and 4 in the Ser and Pol versions. The first constituent
of the Ser and Pol versions is a resumptive adverbial expression (ipak ‘nevertheless’
and mimo to ‘despite that’) which reinforces the meaning of the preceding
concessive clause, ‘although he had a good pretext for coming here’. (The decision not
to assign numbered positions to the clitics in these examples merely reflects a desire
not to prejudge the issue of their syntactic status.)

(6)  he was haunted at every step by the fear that...                        [Eng]

     ipak          ga   je        na svakom koraku  proganjao   strah,  da     [Ser]
     nevertheless  him  aux.past  on every step     persecuted  fear    that
     1                            2                 3           4

     na svakom koraku  progonio    ga   je        strah,  da                   [Cro]
     on every step     persecuted  him  aux.past  fear    that
     1                 2                          3

     ga   je        pri vsakem koraku  preganjal   strah,  da                  [Sln]
     him  aux.past  at every step      persecuted  fear    that
                    1                  2           3

     mimo to,      nieustannie  prześladowała  go   myśl,    że                [Pol]
     despite that  unceasingly  persecuted     him  thought  that
     1             2            3                   4




Textual evidence can be used to support, or disprove, certain hypotheses; and negative
evidence is as important as positive evidence. From earlier observations it was
clear that there is a higher incidence of complex constituents interrupted by clitics in
Cro than in Ser, and that Sln yields no such examples. Thus with regard to the transition
between stages B and C of (1), Sln has apparently proceeded all the way, Ser has
proceeded less far, and Cro is the most conservative of the three languages/dialects.
It had also been previously observed that there is a strong tendency for main verbs
to occur earlier in clauses in Sln than in S-Cr (Bennett 1986:13-14, 1987:277-78). On
the basis of these observations, Bennett hypothesized that Sln might already be closer
to the verb-clitic stage (stage D) than S-Cr and that, within S-Cr, Ser might have proceeded
further than Cro (1990:1311, 2000:453-54). In the Nineteen Eighty-Four texts,
Table 1 indicates, for each language, what proportion of the clitics immediately preceded
the main verb, what proportion immediately followed the main verb, and what
proportion were separated from the main verb. (The corresponding Pol figures are
included even though they are not relevant to the point at issue.)

From the point of view of proximity to the verb-clitic stage, we are interested in
the combined percentages for clitics adjacent to the main verb, i.e., the sum of the first
two columns: Sln - 72%; Ser - 70%; and Cro - 73%. The differences between these percentages
are presumably not statistically significant, and in any case do not reflect the
predicted order of Sln, Ser, Cro. The hypothesis was therefore not supported. Moreover,
the textual data enable us to see why the hypothesis was ill-conceived (Bennett
2000:460). Admittedly, where clitics follow the first clause-constituent in each of Sln,
Ser and Cro, the possibility of the clitics and main verb being adjacent depends on the
possibility of the main verb occurring earlier, and is therefore higher in Sln than in
S-Cr. There is indeed no shortage of examples such as (7):

Language     Clitic(s) + MV    MV + Clitic(s)    Non-adjacent
Slovenian    61%               11%               28%
Serbian      50%               20%               30%
Croatian     41%               32%               27%
Polish       11%               82%                7%

Table 1. Proportion of clitics immediately preceding, immediately following or
separated from their main verb, based on approximately 400 examples in Slovenian,
Serbian and Croatian and 153 examples in Polish.⁶

(7) a. O’Brien  je        imel  v   rokah  kos    papirja   [Sln]
       O’Brien  aux.past  had   in  hands  piece  of-paper

    b. O’Brien  je        među     prstima  držao  komadić  papira    [Cro]
       O’Brien  aux.past  between  fingers  held   scrap    of-paper

       ‘O’Brien had a slip of paper between his fingers’

What was overlooked was that when clitics occur later than after the first constituent
of a clause, which is frequently the case in Ser and Cro, this may result in the main
verbs and clitics being adjacent in S-Cr but separated in Sln. Relevant examples are
provided in (8) and (9):

(8) a. Iz neznanega razloga je vedno mislil, da... [Sin] 

from unknown reason aux.past always thought that 

b. Zbog nekog razloga uvijek je mislio da... [Cro] 

for some reason always aux.past thought that 

‘For some reason he had always thought that...’ 

(9) a. Spodaj na ulici so vrtinci vetra sukali prah in... [Sin] 

down on street aux.past eddies of-wind twisted dust and 

b. Dolje na ulici, mali virovi vjetra vrtljeli su prašinu i... [Cro]

down on street little eddies of-wind turned aux.past dust and 

‘Down in the street little eddies of wind were whirling dust and...’ 

To judge from the figures in Table 1, it would seem that the two separate tendencies 
illustrated by (7), on the one hand, and (8) and (9), on the other, cancel each other out. 
The value of using corpus-based data in this case, therefore, was twofold: on the one 
hand it revealed that a particular hypothesis was ill-founded and on the other hand it 
revealed why the hypothesis was ill-founded. 
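
The arithmetic behind the comparison drawn from Table 1 can be checked mechanically. The following is a minimal sketch in Python, not part of the original analysis: the dictionary layout and the function names are illustrative assumptions, and the prediction is encoded as a strictly decreasing share of verb-adjacent clitics from Sin through Ser to Cro, which is only one possible reading of the hypothesis.

```python
# Percentages copied from Table 1; the data layout and names are illustrative.
TABLE_1 = {
    "Slovenian": {"clitic_mv": 61, "mv_clitic": 11, "non_adjacent": 28},
    "Serbian": {"clitic_mv": 50, "mv_clitic": 20, "non_adjacent": 30},
    "Croatian": {"clitic_mv": 41, "mv_clitic": 32, "non_adjacent": 27},
    "Polish": {"clitic_mv": 11, "mv_clitic": 82, "non_adjacent": 7},
}


def adjacent_share(row):
    """Combined percentage of clitics adjacent to the main verb."""
    return row["clitic_mv"] + row["mv_clitic"]


def follows_predicted_order(table, order):
    """Return True if the adjacent share falls strictly along `order`.

    Assumption: 'closer to the verb-clitic stage' is read here as
    'higher share of verb-adjacent clitics'.
    """
    shares = [adjacent_share(table[lang]) for lang in order]
    return all(a > b for a, b in zip(shares, shares[1:]))


if __name__ == "__main__":
    for lang in ("Slovenian", "Serbian", "Croatian"):
        print(lang, adjacent_share(TABLE_1[lang]))
    print(follows_predicted_order(TABLE_1, ["Slovenian", "Serbian", "Croatian"]))
```

On this encoding the three shares differ by at most three percentage points and do not fall in the predicted order, which is the pattern described in the text. The sketch says nothing about statistical significance, since only approximate example counts are reported.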

Examples (8)b and (9)b, and also the Sin, Ser and Cro versions of example (6), 
allow us to mention the topic of deviations from a strict application of P2, which is 
not merely important for an understanding of the Sin and S-Cr clitic systems but will 
also be relevant in the discussion of Pol in section 2. Sin deviates from P2 in allowing
P1 occurrences of clitics: see, for instance, the Sin version of (6), where the clitic
cluster is in initial position in the main clause, following a subordinate clause. For
such an order of elements to be possible depends on the Sin clitics being prosodically
neutral, i.e., able to occur either as enclitics or as proclitics (Toporišič 1976:58, 535).
In S-Cr, and also in Pol, the clitics with which we are concerned are strictly enclitic. 
Thus clause-initial clitics are not possible in S-Cr, since this would presuppose the 
possibility of proclitic uses. The deviation from P2 that is encountered in S-Cr consists 
in the clitics occurring at P3 or even P4. Occurrence at P3 is seen in (8)b: the clause- 
initial constituent ( Zbog nekog razloga ‘For some reason’) is complex and likely to be 
followed by an intonational break and/or pause. The clitic is therefore attached not 
to the complex first constituent but to the short second constituent ( uvijek ‘always’). 
In (9)b each of the first two constituents is complex, i.e., Dolje na ulici ‘Down in the 
street’ and mali virovi vjetra ‘little eddies of wind’. In this case the clitic is attached to 
the third constituent—the participial verb vrtljeli ‘turned’—and therefore occurs at P4 
rather than P2. The Croatian version of example (6) provides a further instance of P3 
placement, i.e., the clitics are attached to the short second constituent progonio ‘perse¬ 
cuted’ rather than the more complex first constituent na svakom koraku ‘at every step’. 
A further possibility in either Ser or Cro would have been to place the participial verb 
proganjao/progonio ‘persecuted’ in first position in the clause and attach the clitics to
this. Neither the Ser nor the Cro translator opted for this particular strategy in this 
instance. What the Ser translator did was insert a short word at the beginning of the 
clause that has no counterpart in the English original, i.e., the adverb ipak ‘neverthe¬ 
less’, which reinforces the meaning of the preceding concessive clause ‘although he 
had a good pretext for coming here’; the clitics are attached, in P2, to this extra word. 

1.3 formal and functional approaches. Franks and King (2000) discuss clitics 
using the Minimalist version of formal generative grammar, and distinguish between 
approaches which are essentially syntactic, those which are essentially phonological, 
and those which combine syntax and phonology. They themselves favor a mixed 
approach which is mainly syntactic. As for S-Cr examples such as (10), Franks and 
King contrast them (2000:220) with sentences such as (11): 

(10) Anina im sestra nudi čokoladu [S-Cr]

Ana’s to-them sister offers chocolate

‘Ana’s sister is offering them chocolate’

(11) Anina dolazi sestra [S-Cr]

Ana’s comes sister

‘Ana’s sister is coming’

They claim that in general the complex constituents in examples such as (10) can also
be interrupted by non-clitic constituents, as in (11). They conclude that the mechanism
involved is syntactic rather than phonological. Another demarcation issue that
they discuss involves the order in which individual items occur in a clitic cluster in
particular languages. They reject the approach that posits a separate morphological
template for each language, on the grounds that such an approach treats the facts of
each language as arbitrary. Instead they favor a syntactic treatment, which they claim
is better able to capture cross-linguistic generalizations (2000:320-23). Moreover, 
they claim that the same kinds of syntactic generalizations apply across languages 
with verb-adjacent clitics and languages with P2 clitics and use the same mechanism 
for ordering the individual clitics in each case. Specifically, all the clitics occur in the 
head position of various kinds of functional phrases: interrogative li, in languages 
that have it, occupies the highest functional head position, C°; auxiliaries (except 3rd 
person je and the Sin future auxiliary bom, boš, bo, etc.) occupy the subject agreement
slot (AgrS°); dative clitics occupy an indirect object agreement slot (AgrIO°) and
accusative clitics a direct object agreement slot (AgrO°); reflexive pronouns are the
head of a reflexive phrase (Refl°); and je and the Sin future auxiliaries occur as head
of a tense phrase (T°). In languages with verb-adjacent clitics, the clitics are base¬ 
generated in the above-mentioned positions, and the auxiliaries and pronominal 
clitics are regarded as pure agreement markers, or ‘nonargument clitics’ (2000:318). 
Adjacency between the verb and the clitics is achieved by the verb moving to be adja¬ 
cent to the clitics. In P2 clitic languages, on the other hand, the pronominal clitics are 
base-generated as arguments of the verb and then move into the various agreement 
phrases—‘for case-checking reasons’ (Franks & King 2000:318)—irrespective of what 
happens to the verb. 

As for functional treatments, Delbrück (1900:49, 51) distinguished P2 systems (which
for him were only word-based rather than constituent-based) from verb-clitic
systems by claiming that in the former the clitics are positioned on a ‘rhythmic- 
musical’ (i.e., phonological) basis, whereas in the latter the clitics are attached to the 
word to which they are most closely related semantically (‘ihrem Sinne nach’). How¬ 
ever, it is misleading to think simply in terms of phonologically vs. semantically orga¬ 
nized systems, since phonological and semantic factors are involved in both kinds of 
system. The reduced phonological prominence of clitics of either type—i.e., their lack 
of stress and the fact that in some cases they are shorter versions of stressed coun¬ 
terparts—is a reflection of their reduced semantic salience. Alternatively, we should 
perhaps speak of their reduced informational salience: pronouns, for instance, refer 
to entities which are contextually or situationally ‘given’; and the meaning of particles
with a connective function is often predictable from the content of the clauses that
they connect (Delbrück 1900:48). Clitic auxiliaries that are used to form particular
tenses—and indeed tense inflections, too—are similar to pronouns in that they have 
an antecedent in the form of a temporal adverbial which gives more specific informa¬ 
tion about the time in question. What unites these three kinds of clitics when they 
occur in a single clitic cluster at P2 seems to be precisely their reduced informational 
and phonological prominence. In other words, the occurrence of the clitics at or near 
the beginning of a clause reflects their ‘thematic’ nature, in the Hallidayan or Prague- 
school sense of the term. Sin example (3b), discussed in section 1.1, was invented for 
the purpose of contrasting Sin and S-Cr (especially Cro) on one particular issue. The 
actual sentence which occurred in the Sin translation of Nineteen Eighty-Four was: 

(12) [Če ne,] vam bo slovar izročil moj služabnik [Sin]

if not to-you aux.future dictionary hand-over my servant

‘[If not,] my servant will give you the dictionary’

(I have added emphasis to the constituent in the English original that would be the 
obvious candidate for tonic stress in the discourse context in which the sentence 
occurs.) The main clause of the Sin sentence is a classic example of a clause which 
proceeds from what is maximally thematic and ‘given’ to what is maximally rhematic 
and ‘new’ (or, alternatively, which exhibits a progressively increasing level of ‘communicative
dynamism’). What prevents the clitics from occurring in clause-initial position
in the majority of the P2-clitic languages is the fact that they are specifically enclitic
and have to ‘lean’ on an accented word to their left. The shift from a P2-clitic system
to a verb-clitic system can perhaps profitably be thought of as involving a change 
from discourse-oriented positioning to semantically-oriented positioning. But even 
in a primarily discourse-oriented system, semantic relatedness of particular clitics 
to (e.g.) the verb is already a relevant factor. Thus although Proto-Slavic is assumed to 
represent the P2 stage in general (Meillet 1934:481-83, Jakobson 1971:16-18), the (accusative)
reflexive clitic of the two oldest Slavic languages, Old Church Slavic and Old
Russian, frequently occurs immediately after the verb rather than at P2 (Ard 1975:96-97,
quoting Havránek 1963:22). Stone makes a similar point about Old Church
Slavic (1996:216), referring to Večerka (1989:47-48). Thus some clitics appear to be
affected by conflicting pressures—on the one hand, the pressure to congregate with 
other informationally and phonologically non-prominent items near the beginning 
of a clause and, on the other hand, the pressure to be attached to the word to which 
they are most closely related semantically, normally the verb. That, within a basically 
P2 system, it is specifically reflexive pronouns that are more likely to be attached to the 
verb than other pronouns is related to the fact that reflexive verbs are often equivalent, 
semantically, to middle voice verbs or intransitive verbs (Ard 1975:112). Change from 
the one type of system to the other would depend on a gradual shift in the magni¬ 
tude of the two pressures affecting particular clitics, though it is likely that the inter¬ 
rogative clitic li and various clause-connective clitics would not be sufficiently closely 
related semantically to the verb for them ever to become verb-clitics. 

2. polish: a case of change in progress. According to Spencer (1991:367), the Pol 
clitic system is ‘unlike that of other Slav languages in a number of respects’. Moreover, 
whereas S-Cr and Sin are P2-clitic languages and Bulgarian and Macedonian are verb- 
clitic languages, in Pol ‘the distribution is determined largely by phrase-level prosodic 
considerations’ (1991:367, interpreting the account of De Bray 1980:326-28). Among 
the specific idiosyncrasies of Pol that Spencer mentions are that ‘it is possible to break 
up strings of clitics’ and that the auxiliary verb ‘clitics’ are ambiguous as between verb 
affixes, on the one hand, and clitics capable of being attached to any host, on the other 
(1991:369-73). Franks and King also regard Pol as constituting a separate category of 
language from those with ‘verb-adjacent clitics’ and those with ‘second-position clitics’ 
(2000:295-305). And they, too, illustrate the possibility of individual clitics occurring
separated rather than in a cluster (2000:338). As for the auxiliary clitics, Franks and 
King point out that they are more like inflections than clitics (2000:298), and indeed 
that they are changing from clitics to inflectional affixes (2000:269). The other respect 
in which Pol clitics have changed, according to Franks and King (2000:298, 338), is 
that the pronominal clitics, which used to be ‘special clitics’ (Zwicky 1977), i.e., both 
prosodically and syntactically idiosyncratic, have now become ‘simple clitics’, i.e., 
there are (allegedly) no longer any syntactic restrictions on their occurrence. 

Andersen’s (1987) account of the Pol auxiliary clitics/verb inflections, reporting on 
the work of Rittel (1975), points out that over the last five hundred years Pol clitic 
usage has gradually undergone a change from a Wackernagel-type P2 system toward 
a verb-clitic system and the present situation, where in many cases the clitics have 
already become verb-suffixes. At the beginning of this period the Wackernagel prin¬ 
ciple was already beginning to be relaxed, but even at the present time it is still 
residually in effect. The penultimate-syllable stress of Pol provides one kind of evidence
as to whether or not particular clitics have been incorporated into the verb as affixes.
Northern dialects have proceeded further in this respect than southern dialects and 
the standard language represents an intermediate situation. In northern dialects the 
addition of any of the ‘auxiliary clitics’ (singular or plural) effects a change in the stress 
pattern of the resulting formation, which suggests that they have become affixes. In 
some southern dialects, on the other hand, all of the clitics (singular and plural) seem 
to be more loosely attached to the verb stem, in that they have no effect on the stress 
pattern of the resulting formation. The process of absorption of the clitics into the 
verb as affixes has affected the singular forms before the plural forms for phonological 
reasons—the longer plural clitics had a greater degree of autonomy than the shorter 
singular forms. The standard language represents the half-way stage at which the sin¬ 
gular clitics have been absorbed as affixes and the plural suffixes are still independent 
of the stem in terms of its stress pattern. 

With regard to the gradual process of deviation from the Wackernagel principle 
over the last five hundred years, Andersen reports Rittel’s (1975) findings in the fol¬ 
lowing terms (1987:44): 

One type of deviation is placement after an initial phrase (rather than the first 
word), which safeguards the adjacency of its constituents. Another type is place¬ 
ment after the first word or phrase that follows an intonational caesura, which 
helps set off the thematic elements that precede it. Yet another is placement after a 
word that carries emphatic stress. 

These types of deviations from strict P2 placement are specific cases of the factors that 
Spencer refers to generally as ‘phrase level prosodic considerations’ (1991:367-69). 
They are also reminiscent of the ways in which modern S-Cr deviates from strict P2 
placement that were discussed at the end of section 1.2. (Pol is also like S-Cr insofar 
as its clitics are strictly enclitic.) In the following section we shall see to what extent 
our textual data for Pol are compatible with Andersen’s account of the stage that Pol
has reached. 

3. textual data for polish. The first thing to note in relation to the Pol data is that 
all the examples of clitics can be interpreted either as P2 clitics or as verb clitics/affixes. 
This fact is consistent with the suggestion that present-day Pol is in the process of 
changing from a P2 system to a verb-clitic system (and beyond). 7 Some examples are 
unambiguously P2 clitics, e.g., was ‘you (plur.)’ in (13), and some can perhaps only be 
analyzed as verb clitics, e.g., by ‘would’ in (14):

(13) Gdy was w końcu złapią,...
when you in end will-catch
‘When they catch you in the end,...’

(14) W takim wypadku zmianie uległaby jego twarz,

in such event to-change succumb-would his face

‘In such an event, his face... would undergo change’

Given that the clitic in (14) is an auxiliary adjacent to the main verb, there seems 
little point in considering whether it could also involve one of the regularly attested 
deviations from P2 order. On the other hand, since it seems likely that pronominal 
ich ‘them’ in (15) follows the first constituent after an intonational caesura, it could 
perhaps reasonably be analyzed as being both a verb-clitic and a somewhat liberally 
defined P2 clitic. Other examples clearly illustrate simultaneous P2 and verb-clitic 
positioning without any need to invoke attested deviations from strict P2 placement, 
e.g., was ‘you (plur.)’ in (16). 

(15) W ten sam sposób rozpoznał ich w chwili, gdy...

in that same way he-recognized them in moment when 

‘He had identified them in that same way at the moment when...’ 

(16) Zapewniam was, że Braterstwo istnieje

I-assure you that Brotherhood exists 

‘I assure you that the Brotherhood exists’ 

The 5000-6000 words of text analyzed so far have revealed no cases of split clitic 
clusters. However, the fact that Pol clitics occur either at P2 or adjacent to the verb 
predicts such a possibility, and further analysis may reveal relevant examples. Alterna¬ 
tively, such structures may be generally absent from the personal style of the transla¬ 
tor of Nineteen Eighty-Four. 

Two further things are of interest in relation to the Pol data. First, P2 placement 
of Pol clitics is more common in subordinate clauses than in main clauses. This is 
consistent with the observation that, in all cases of word order change in progress, 
subordinate clauses exhibit a more conservative word order than main clauses. For 
Pol, Andersen (1987:29,35) reports the findings of Rittel (1975), that main clauses have 
been changing ahead of subordinate clauses over the whole period during which
the shift from P2 positioning to verb-clitic positioning has been taking place. With
regard to the conditional formative by ‘would’, Decaux observes that in main clauses it
tends to follow the -l participle but that in a subordinate clause introduced by a con¬ 
junction it is obligatorily attached to the conjunction (1955:24). This, too, is borne out 
by our data; (14) can be contrasted, in this connection, with (17). 

(17) jakby go ktoś zdzielił pałką gumową po ciemieniu [Pol]

as-would him someone imparted stick rubber on temple 

‘as if someone imparted him [a blow] with a rubber truncheon’ 

(18) Jeśliby coś powiedział [Pol]

if-would something he-said 

‘If he should say something’ 

It may be that in cases like (17) two separate factors are at work: on the one hand, 
the position after the conjunction involves the original Wackernagel positioning, 
which persists longer in subordinate clauses; and, on the other hand, the conjunc¬ 
tion—rather than the verb—may be precisely the constituent of the clause to which 
the conditional formative bears the closest semantic relationship. This certainly seems 
plausible in an example such as (18), where there is an obvious semantic relationship 
between the conjunction jeśli ‘if’ and the conditional meaning of by ‘would’.

Finally, and more significantly, there are restrictions on the positions in which indi¬ 
vidual clitics may occur. Thus, with regard to auxiliary clitics, although most discus¬ 
sions of Pol illustrate one or more possibilities of the more conservative Wackernagel 
positioning, e.g., (19)b-d, in addition to the verb-clitic positioning (19)a, the Polish
translator of Nineteen Eighty-Four attached auxiliary clitics exclusively to the verb 8 . 

(19) a. Ciekawą książkę kupiła-ś Janowi [Pol]

interesting book bought-2sg. for-John

‘You bought an interesting book for John’

b. Ciekawą-ś książkę kupiła Janowi

c. Ciekawą książkę-ś kupiła Janowi

d. Ciekawą książkę Janowi-ś kupiła

Also, as might be expected in the light of our earlier discussion, the reflexive pronoun 
się occurs only as a verb clitic. As for other pronominal clitics, the accusative forms
were attested in either position, but all of the dative clitics occurred exclusively as verb 
clitics. Other discussions of Pol clitics certainly illustrate P2 occurrences of dative clit¬ 
ics, e.g., Spencer (1991:369). It would seem, then, that we may again be dealing here 
with the personal style of the translator of Nineteen Eighty-Four. Nevertheless, it is 
possible that there is a general tendency in the language for dative clitics to gravitate 
to the verb ahead of accusative clitics. 

4. formalizing the change from P2 clitics to verb-clitics. Franks and King
present the following ‘diachronic scenario’ for the change from the P2 system of ‘Older
Bulgarian’ to the verb-clitic system of the present-day language (2000:318). Slavic NPs
were, and in most cases still are, embedded in a KP (case phrase), but in Bulgarian (and
also Macedonian) the loss of case and the introduction of articles have resulted in the
KP giving way to a DP (determiner phrase). Moreover, the K° morpheme, ‘instead of
moving to Agr (as an argument for case-checking reasons)... was reanalyzed as being 
base-generated in Agr’. This scenario was not invoked by Franks and King in relation 
to the changes going on in Polish, since (as pointed out in section 2) they do not see 
these changes as involving an ongoing shift from P2 clitics to verb-clitics. However, on 
the basis of the work of Rittel (1975) and Andersen (1987), as reported in section 2, and 
my own textual analysis of section 3, I have argued that Polish has been undergoing
just such a change, coupled with a further change of verb-clitics to verb-inflections. 
It is relevant, therefore, to ask to what extent the same ‘diachronic scenario’ might fit 
the facts of the Polish case; and the answer is: not at all well, and for two reasons. First, 
Polish has not lost the category of case or developed articles. Thus the development of 
verb-clitics by no means necessarily correlates with the rise of articles. Secondly, the 
term ‘reanalysis’ implies that word order change as such was not involved in the Bul¬ 
garian case. As for Polish, the phenomenon of split clitic clusters and the main body 
of our textual data, which indicates that particular clitics favor the position adjacent 
to the verb whereas others still regularly occur at P2, strongly suggest not a reanalysis 
but an ongoing word-order change, with certain clitics being attracted to the newer 
position ahead of others. 

In any case, the Minimalist framework is by no means the only framework in 
which to formalize the change with which we have been concerned. In recent years 
there have been increasing numbers of applications of Optimality Theory (OT) to the 
topic of clitics, e.g., Anderson (2000), Legendre (2000) and Vincent (2001). Moreover,
in sections 1.3 and 3 it was suggested that Polish clitics have been subjected to two different
pressures: on the one hand, the pressure to occur early in the clause for reasons
of discourse structure and, on the other hand, the pressure to occur adjacent to
the constituent to which they are most closely related semantically. There is thus competition
between the different positions in which the clitics may occur, and it was
suggested that over time the strength of the two pressures has changed. A further 
promising framework to explore, therefore, is that of network grammars of the stratificational,
and now ‘neurocognitive’, variety (Lamb 1999), since the notion of competition
has occupied a prominent place within spreading activation implementations
of these grammars for more than twenty years, e.g., Dell & Reich (1980), Lamb
(1999:218-21, 233-36) 9 .

5. conclusion. This paper began with the suggestion that clitics, by their nature, need 
to be examined diachronically; and chose to focus on Polish, within Slavic, since it rep¬ 
resents a particularly interesting case of change-in-progress. The textual data that were 
used in the paper provided evidence for various sorts of conclusions. The Sin, Ser and 
Cro data demonstrated, first, that none of these three languages/dialects is any closer to
reaching the verb-clitic stage than the other two and provided an explanation why an 
earlier hypothesis on this issue was ill-founded. At the same time, the data allowed us 
to see in what ways Sin, Ser and Cro clitics deviate from a strict application of P2 place¬ 
ment; and the deviation facts for Ser and Cro, in turn, mirrored the observations of 
Rittel (1975) and Andersen (1987) concerning the gradual loss of P2 placement by Polish. 
The Polish data supported the view that the clitics of this language are still in the process 
of changing from P2 to verb-clitic placement, and that the development in question has 
involved a word-order change rather than a structural reanalysis. The paper also argued 
that adequate attention needs to be paid to both formal and functional considerations. 
From a functional point of view, it was suggested that the change that Polish has been 
undergoing can best be understood as a gradual shift from discourse-oriented position¬ 
ing of its clitics to semantically-oriented positioning, with individual clitics differen¬ 
tially affected by the two separate pressures at any given time and the change from one 
type of system to the other depending on a gradual shift in the magnitudes of the two 
pressures. Finally, doubts were raised over the applicability of a particular Minimalist 
formalization of related diachronic phenomena. 

I am also pleased to acknowledge the help of Harry Leeming, co-author of Bennett and
Leeming (1996), and Monica Leeming, in relation to the discussion of Polish; and that of 
Simona Bennett in relation to Slovenian and Serbo-Croatian examples. Finally, I am grate¬ 
ful for comments on an earlier version of the paper by Thea Bynon and two anonymous 
reviewers. 

The clitics with which we shall be concerned correspond to independent words from 
a number of syntactic categories, such as pronouns, auxiliary verbs, conjunctions and 
modal particles. They are specifically enclitics, though in, e.g., Slovenian and Macedonian 
they may occur also as proclitics. The paper is not concerned with such specifically pro¬ 
clitic items as short unstressed prepositions and the marker of negation. 

Examples (2)a and (2)b are taken from the Croatian (Cro) and Serbian (Ser) translations, 
respectively, of George Orwell’s novel Nineteen Eighty-Four. Interruption of a clause-ini¬ 
tial complex constituent can be encountered also in Ser but is less common than in Cro. 
The preposition v ‘in’ in this example is proclitic to этой (etoj) ‘this’. Thus the first accented
word in the embedded clause is этой (etoj).

Between 5000 and 6000 words of the four Slavic texts were analysed. Roughly half of the 
chosen material consists of predominantly past tense narrative, while the remainder rep¬ 
resents predominantly present tense dialogue. 

The reason the Pol text has so few clitics, by comparison with the Sin, Ser and Cro texts, is 
that the 3rd person forms of the auxiliary (both singular and plural) are null. 

This observation contrasts with Franks and King’s (2000:298, 338) suggestion, reported 
in section 2, that the Pol clitics have become simple clitics capable of occurring in any 
syntactic position. 

Example (19) is adapted from Franks and King (2000:158). 

Limitations of space in the present paper prevent me from comparing the three above
formal theories—Minimalism, OT and neurocognitive networks—by applying them to 
the same diachronic data. However, I hope to remedy this situation in the near future in 
a sequel to this paper. 

THE MORE THINGS CHANGE:

David Bowie
Brigham Young University 


at their core, historical linguistics and much of sociolinguistics study the same 
phenomenon: language change over time. Because of this focus, both of these sub¬ 
fields continually come up against the ‘actuation problem’ or the question of what 
exactly causes any particular linguistic change at any particular time in any particular 
speech community. Although this is an important question for studies of language 
change, there is some question as to whether it can be solved in the sense of develop¬ 
ing what might be called principles of actuation 1 . 

Unfortunately, in attempts to solve the actuation problem, some of the work that 
has been done is based on the perpetuation of conventional wisdom rather than 
rigorous original scholarship. For example, the claim that Latinate usages in English
texts as late as the fourteenth century were the result of French influence left over from
the Norman Conquest has been widely repeated in the academic literature, but this 
has been shown to be false, since the Norman population was fairly quickly assimi¬ 
lated linguistically (Clark 1992). There are, however, a few cases in which the actuation 
question has been solved for particular changes in particular speech communities, 
and the solutions hold up under close scrutiny. The two this paper looks at most 
directly are the reduction of monophthongal /ay/ in urban Texas (Thomas 1997) and 
the merger of /a/ and /ɔ/ in eastern Pennsylvania (Herold 1997). These are particularly
interesting, because both of these solutions to the actuation problem have found the 
same underlying cause: large-scale immigration creating a dialect leveling situation. 
Given that solutions to the actuation problem have been found in these cases, and the 
solutions are so similar, a possibly more important question has emerged: can solu¬ 
tions to the actuation problem for a particular change in a particular speech commu¬ 
nity be extended to similar changes in other speech communities? This paper seeks 
to answer this question. 

1. solutions to the actuation problem. To set the stage I need to briefly review 
the two solutions to the actuation problem I referred to earlier. First I will sketch 
out the process by which /a/ and /ɔ/ merged in eastern Pennsylvania, and then I will
discuss the reduction in rates of monophthongal /ay/ in urban Texas. 

1.1. the merger of /a/ and /ɔ/ in eastern Pennsylvania. Herold (1997) noted that
certain communities in eastern Pennsylvania show a tendency toward the cot-caught 
merger, and the evidence shows that the merger developed in eastern Pennsylvania 
independently of contact from other varieties where the merger exists. Also, perhaps
more interestingly, the status of the merger is different in different cities, depending 
on whether they are historically coal-mining communities (generally merged) or not 
(generally unmerged); of the twelve coal-mining communities studied, ten show the 
merger, and none of the eight non-mining communities exhibits the merger (Herold
1997: Figure 7). This difference between mining and other communities points to 
the possibility of some sort of demographic or social difference between them. After 
studying the various ways in which the mining and non-mining communities differ, 
Herold found a significant difference in early twentieth century immigration—the 
mining towns attracted a significantly larger number of Lithuanian and Polish immigrants.
From this, Herold concluded that the merger of /a/ and /ɔ/ in the mining
towns of eastern Pennsylvania was caused by contact with a massive influx of speak¬ 
ers who, because their first language had no such distinction in the oral vowels, put 
pressure on the rest of the speech community to abandon the distinctions. On the 
other hand, the number of speakers without that distinction who came to the non¬ 
mining towns was much smaller. Therefore, they did not place as much pressure on 
the local varieties there, and the distinction survived 2 . 
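
The strength of the mining/non-mining contrast cited above (ten of twelve mining communities merged, none of eight non-mining communities) can be illustrated with a simple exact test. The following is only a sketch for the reader's benefit: it is not part of Herold's analysis, the function name `fisher_one_sided` is hypothetical, and the only inputs are the counts reported in the text.

```python
from math import comb

def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Returns the probability, with all margins held fixed, of a top-left
    cell at least as large as the observed value a.
    """
    row1 = a + b          # here: mining communities
    row2 = c + d          # here: non-mining communities
    col1 = a + c          # here: merged communities
    total = comb(row1 + row2, col1)
    max_a = min(row1, col1)
    tail = 0
    for k in range(a, max_a + 1):
        # comb() returns 0 when col1 - k exceeds row2, so impossible
        # tables contribute nothing to the sum.
        tail += comb(row1, k) * comb(row2, col1 - k)
    return tail / total

# Counts as cited from Herold (1997: Figure 7):
# mining: 10 merged, 2 unmerged; non-mining: 0 merged, 8 unmerged.
p = fisher_one_sided(10, 2, 0, 8)
print(f"one-sided p = {p:.6f}")
```

With these counts the p-value is 66/184756, roughly 0.0004, so a split this sharp would be very unlikely if merger status were unrelated to mining history. This says nothing about why the two kinds of community differ, which is the question the immigration data address.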

1.2. monophthongal /ay/ in urban Texas. Thomas’ study of /ay/ in Texas (1997) 
deals with a somewhat different sort of sound change—the reversal of a trend. 
Thomas found a shift away from monophthongal /ay/ among native Texan Anglos, 
but found that this trend was limited to large urban areas, with rural and smaller 
urban areas continuing to favor the monophthongal form.³ Once again, this sort of
result raises a clear question: Is there a social or demographic difference between 
residents of very large cities and others? Thomas investigated further and found that 
there is, indeed, a difference: residents of very large urban areas were surrounded
by a larger proportion of relatively recent immigrants to the state of Texas. Specifically,
19.1% of residents of very large metro areas at the time of the study had lived
there ten years or less, compared with only 14.8% and 10.5% of residents of smaller 
metro and town/rural areas, respectively (Thomas 1997: Table 4). As recent migration 
to Sunbelt states like Texas has come largely from non-Southern areas of the United 
States, Thomas concluded that there was enough dialect contact between Texans in 
large urban areas and non-Texans without monophthongal /ay/ to put pressure on the 
varieties used in the local speech communities. 

1.3. the common thread. A common thread links Herold’s (1997) and Thomas’ (1997)
studies: in each case, the sound change under investigation was found to
result from large-scale immigration putting pressure on local varieties to bring about
a change. These are fairly solid solutions to the actuation problem, and one might ask
whether this could be a general principle of actuation. That is, if large-scale immigration
and the resulting dialect contact can cause linguistic changes, can we say that all
(or at least most) changes are caused by immigration and dialect contact? Along the
same lines, can solutions to the actuation problem for particular changes in particular
speech communities like the ones described above be extended to similar changes in
different speech communities?

2. parallel studies. To determine whether it is possible to say that we can develop 
principles of actuation from the observations described above, I present two cases 
of changes that are at least roughly parallel: the merger of /a/ and /ɔ/ in Utah (Di
Paolo 1992; Bowie et al. 2001) and monophthongal /ay/ in Southern Maryland (Bowie 
2001a). These changes will be compared directly with their equivalents, which will 
give insight into whether we can claim that the causes are the same in each case. 

2.1. the merger of /a/ and /ɔ/ in Utah. That /a/ and /ɔ/ are merged in Utah (along
with most of the western United States) is widely enough known that the fact rarely
gets more than a passing mention, if that, in most descriptions of the local varieties
(Labov et al. in press). However, studies have dealt with the status
of this merger in some depth both in modern times (Di Paolo 1992) and in the
nineteenth century (Bowie et al. 2001). Preliminary results from the Early Utah English
Project, which examines the development of Utah English during the first half-century
of its existence (1847-’96), indicate that the merger first appears appreciably
among individuals born in the 1870s. To this point, there seems to be a parallel with
the context for the cot-caught merger in eastern Pennsylvania as described by Herold
(1997), particularly since there was quite a bit of immigration to the Utah Territory
around that time. The 1880 census shows 30.54% of Utah’s population as born outside
of the United States. The parallel breaks down fairly quickly after that, though, as the
nature of the immigration was quite different. A review of place of birth as reported
in the 1880 census shows that less than 8% of the residents of the Utah Territory were
born in countries with widely spoken languages that lack something analogous to
the /a/-/ɔ/ distinction. Further, less than 13% of Utah residents were born in countries
in which languages other than English were spoken at the time. This is a very
small percentage of total immigrants when compared to the proportion of Polish and
Lithuanian immigrants that Herold found in the cities with the cot-caught merger
in eastern Pennsylvania, particularly when one considers that there were other
groups of immigrants to eastern Pennsylvania who could also have placed pressure
on the local variety to simplify its vowel system. Although one would expect that
the coming together of speakers of a variety of dialects of English must have given
rise to a dialect leveling situation, dialect leveling situations seem to function differently
from language contact situations (compare, for example, the conclusions of
Kerswill & Williams 2000 and Thomason & Kaufman 1988). Given this, it is clear that
even though the situations in eastern Pennsylvania and Utah led to the same merger,
the actual mechanics that would have resulted from the different situations must have
differed.⁴ This means that if we see the same linguistic change occurring in two different
speech communities, we cannot automatically assume that the causes are the
same. This is perhaps only common sense, but it is still necessary evidence regarding
the possible extension of solutions to the actuation problem.

Figure 1. /ay/-monophthongization among Waldorfians by year of birth (Bowie 2001a,
Figure 3).

2.2. monophthongal /ay/ in Southern Maryland. A comparison of the regression
of monophthongal /ay/ in Southern Maryland (Bowie 2001a) and in urban Texas
(Thomas 1997) sheds light on different, and perhaps more important, issues than does
the low back vowel merger discussed above. Waldorf, a town in Southern Maryland,
has experienced a reduction in rates of /ay/-monophthongization during the twentieth
century similar to that reported in urban Texas. In Waldorf, the abandonment
of monophthongal /ay/ appears to be a change in progress, with women leading the
change; a graph showing the decline of /ay/-monophthongization in apparent time
among twenty-five white middle-class Waldorfians (fourteen female, eleven male) is
shown in Figure 1. The decrease over apparent time provides a good fit to an exponential
curve, with R² = 0.7734 for the women and R² = 0.9593 for the men. Multivariate
analysis confirms that this apparent-time effect exists and that it is extremely
strong. In addition, it is important to note that Waldorf’s population increased
faster than national and state rates for population increase for much of the twentieth
century, as shown in Figure 2.⁵
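
For readers who want to see what an exponential fit with an R² value involves, the sketch below fits y = a·exp(b·x) by ordinary least squares on the logarithm of the rates. This is an assumption about method, since the text does not say how the curves in Figure 1 were fitted, and the data points are invented for illustration; they are not the Waldorf measurements. The function name `fit_exponential` is likewise hypothetical.

```python
from math import exp, log

def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by least squares on log(y).

    Returns (a, b, r2), where r2 is the coefficient of determination
    computed on the log scale. All ys must be positive.
    """
    n = len(xs)
    logs = [log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxy / sxx
    ln_a = mean_l - b * mean_x
    ss_res = sum((l - (ln_a + b * x)) ** 2 for x, l in zip(xs, logs))
    ss_tot = sum((l - mean_l) ** 2 for l in logs)
    return exp(ln_a), b, 1 - ss_res / ss_tot

# HYPOTHETICAL data, not the Waldorf measurements: year of birth and
# percentage of monophthongal /ay/ for six imaginary speakers.
years = [1910, 1925, 1940, 1955, 1970, 1985]
rates = [62.0, 40.0, 31.0, 17.0, 12.0, 6.5]
a, b, r2 = fit_exponential([y - 1900 for y in years], rates)
print(f"rate ~ {a:.1f} * exp({b:.4f} * (year - 1900)), R^2 = {r2:.3f}")
```

A negative b corresponds to a decline across apparent time, and an R² closer to 1 means that the speakers' rates lie closer to the fitted curve.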

At first glance, it might seem that the same process has occurred in Waldorf as
in urban Texas: Waldorf has also experienced massive immigration (primarily
from elsewhere in the United States), necessarily leading to a high level of dialect
contact. It could seem, then, that the reduction in the rate of /ay/-monophthongization
over time in Waldorf is due to this high level of dialect contact, just as it was in
urban Texas. A closer look at the demographic situation in Waldorf as it relates to 
this change, however, shows that this cannot be the case. The population figures in 
Figure 2 show that Waldorf’s population did not begin to climb until the 1940s, and 
the local population did not begin to climb sharply until the 1950s. This stands in 
contrast, however, to the rates of /ay/-monophthongization shown in Figure 1, which 
show a sharp drop in /ay/-monophthongization rates well before the local population 

Figure 2. United States, Maryland, and Charles County population, 1900-2000 (Bowie
2001a, Figure 2).

Figure 3. VARBRUL weights by year of birth (Bowie 2001a, Figure 7).

began to climb. Further evidence supporting this observation is shown in Figure 3,
which shows VARBRUL weights by year of birth from a larger analysis of this change.
This factor group, which is highly significant, as evidenced by the large distance between
the factors most and least favoring /ay/-monophthongization, shows the oldest age
group strongly favoring monophthongal /ay/, the second-oldest age group favoring it
less strongly, and so on to the youngest age group, which fairly strongly disfavors it.
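
For readers unfamiliar with VARBRUL weights: in the logistic model that such analyses use, an overall input probability and the weights of the applicable factors combine on the log-odds scale, so that weights above .5 favor the variant in question and weights below .5 disfavor it. The sketch below illustrates that combination only. The input probability and the three weights are invented for the example and are not the values plotted in Figure 3; the function names are hypothetical.

```python
from math import exp, log

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return log(p / (1 - p))

def predicted_rate(input_prob, weights):
    """Combine an input probability with factor weights on the logit scale.

    Equivalent to p/(1-p) = p0/(1-p0) * product of w/(1-w) over the weights.
    """
    total = logit(input_prob) + sum(logit(w) for w in weights)
    return 1 / (1 + exp(-total))

# HYPOTHETICAL values, not taken from Bowie (2001a).
input_prob = 0.30
for label, weight in [("oldest", 0.80), ("middle", 0.55), ("youngest", 0.25)]:
    rate = predicted_rate(input_prob, [weight])
    print(f"{label:8s} weight {weight:.2f} -> predicted rate {rate:.3f}")
```

A weight of exactly .5 leaves the input probability unchanged, which is why the distance of each age group's weight from .5 indicates how strongly that group favors or disfavors the monophthong.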

As can be seen from the VARBRUL weights in Figure 3, /ay/-monophthongization
was declining well before the middle of the twentieth century. Although
individuals born between 1940 and 1949 favor /ay/-monophthongization, they do so
less than those born between 1920 and 1939. Similarly, those born between 1920 and
1939 favor /ay/-monophthongization quite strongly, but less than those born before
1920. As can be seen from Figure 2, though, large-scale migration into the Waldorf
area did not start until the 1940s. As a result, an argument based on immigration
cannot explain the beginning of the move away from monophthongal /ay/ in the
Waldorf speech community. It should be noted, though, that by the end of the twentieth
century Waldorfians certainly had more contact with speakers of other varieties
than they did just after World War II. This came about not just because of migration
into the area, but also because increasing numbers of Waldorfians began commuting
to Washington, D.C. for work in the intervening years (Edelen et al. 1976). This dialect
contact may have accelerated the trend toward diphthongal /ay/, or it may have
caused the variation to tend toward a lower level than it might have otherwise, even
though it does not appear possible to name it as the cause of the trend.⁶

3. conclusions and discussion. As mentioned at the beginning of this paper, 
standing in the background in nearly all studies of language change is the actuation 
question: why did this change ever start? Occasionally this question can be answered 
for a particular change in a particular speech community; given this, it seems only 
reasonable to ask: can such solutions for one speech community safely be extended to 
similar changes in other speech communities? 

This paper has compared parallel sound changes in different speech communities 
and, from the evidence presented, it seems clear that the simple answer to that question
is ‘no’. A closer review, however, shows that there is more to the question than that
simple answer can provide. 

Eastern Pennsylvania (Herold 1997) and Utah English (Bowie et al. 2001) have 
both experienced the same sound change (the merger of /a/ and /ɔ/), and urban
Texas (Thomas 1997) and Southern Maryland (Bowie 2001a) also share a common 
sound change (a trend away from monophthongal /ay/). On the surface, each of these 
pairs appears to have gone through similar demographic changes (specifically, they
have all experienced large-scale immigration), but upon probing deeper, one finds
that the specifics of the demographic changes are different. This result is problematic
for anyone attempting to build a theoretical framework accounting for causes of linguistic
change, as it means that what had appeared to be a promising possibility for
explaining actuation cannot result in a predictive theory. However, the goal of studies
of language change and actuation should, presumably, be the development of a
truly predictive and explanatory theoretical framework.

Recent steps toward general principles of this sort have been made in connection 
with studies of dialect leveling (Kerswill & Williams 2000), vowel shifts (Labov 1994),
and to some extent variation in language perception and production (Bowie 2001b;
Niedzielski 2000). However, in general, the move toward a predictive and explanatory
theory of actuation remains as described by McMahon (1994): theoretical frameworks
explaining the progress of language changes exist, but a theoretical framework
for actuation, the beginning of language change, remains as yet out of reach.

Given this, what should we be looking for so that we can develop a theory of actuation?
Obviously, more work along the lines of that done by Herold (1997) and Thomas
(1997), studying the actuation of particular changes in particular speech communities,
is necessary; such data is required if we are to test the theory we eventually develop.
However, the crucial thing is to look for new patterns in the results that come out of
such work. Herold’s and Thomas’ studies, as well as studies such as
Boberg and Strassel’s work on short-a in Cincinnati (2000), had shown a possible
pattern of certain types of migration as triggers for language change. The comparison
with the facts from the other studies given in this paper shows that migration cannot be
the sole possible trigger for language change. We likely have to move toward a more
general (if not abstract) set of principles of actuation, along the lines of Kerswill and
Williams’ (2000) principles of dialect leveling or Labov’s (1994, 2001) principles of
linguistic change. By focusing on general patterns rather than the specifics of individual
situations, we can eventually bring sociolinguistics and historical linguistics to
a point where a theory of actuation is within our grasp.⁷


¹ Several scholars have noted the importance of this issue. To give just one example, McMahon
(1994:252) reviewed some of the work that has been done on language variation and its
relationship to the actuation problem and noted the importance of the issue but stated
that a general solution ‘remains almost as mysterious and unattainable as ever’.

² Herold’s (1997) conclusions on why the Polish and Lithuanian immigrants had this effect
are somewhat tentative and incomplete. Note, though, that for present purposes it is
unimportant exactly why the Polish and Lithuanian speakers had this effect on the local
variety; what is important is that they triggered a particular pattern of language contact
that resulted in the effect.

³ A VARBRUL analysis gave weights of .60 for very large metro areas, .39 for metro areas and
.43 for towns and rural areas, where higher weights reflect a favoring of the diphthongal
form (Thomas 1997: Table 2).

⁴ Of course, small differences in demography can have disproportionate effects on the local
speech community. As Mufwene (1996) points out, there is presumably a tipping point at
which the linguistic impact of immigrants is great enough to seriously affect a local variety,
but the exact level at which the tipping point is reached is not known. If one speech
community faces immigration slightly above the tipping point and another slightly below
it, the small difference in immigration levels may be reflected by a change in the first
community’s variety and little to no change in the second community.

⁵ Waldorf is an unincorporated municipality for which population counts before 1990 are
not readily available, so census figures for Charles County are used in Figure 2. From
historians’ discussions of demographic changes in Southern Maryland, this should be
adequate for present purposes (Potyraj 1994; Edelen et al. 1976).

⁶ It has been pointed out to me that other factors may be at work in Waldorf, such as a possible
perception of ‘urbanness’ versus ‘ruralness’ or the like in regard to this change. This is
a possibility, and certainly needs to be looked at more closely, but for present purposes it
suffices to point out that the relationship between demographic and linguistic changes in
Waldorf does not fit the same pattern as in urban Texas.

⁷ An appendix of sorts: this paper has looked at parallel sound changes and whether they
can be traced to similar demographic causes. A probably more interesting question is
whether parallel demographic changes result in similar linguistic changes. I suspect that
the answer to that question would once again be no, but there is as yet no evidence for or
against that suspicion and it remains an important and open question.
EVIDENCE FOR THE IMPERATIVE AS A SPEECH-ACT CATEGORY


Inga B. Dolinina 
In this paper I argue that the Imperative should be regarded as a category of speech
act rather than a verbal category. I elaborate a new theoretical framework for this categorization,
and show that the new framework fits the evidence better than the traditional
treatment of the Imperative as a verbal category. A crucial part of my argument
is the legitimacy of so-called ‘mixed paradigms’, which include all person-values and
in which the Imperative can be marked in some persons synthetically and in other
persons periphrastically. I show that my argument rests on both primary linguistic
evidence and theoretical assumptions.

1. controversies about the imperative. There are many controversies about the 
imperative. Is it a verbal category or a speech act category? That is, is it a Mood clustering
with indicative, subjunctive, etc., or a phenomenon of another type clustering
with declaratives and interrogatives? How are we to describe its grammatical semantics,
as a meaning component inside a proposition or one outside it? Is the imperative
meaning a semantic primitive, or a complex of components? What constructions can 
be recognized as Imperative? What is the scope of the Imperative paradigm? Can it 
include only synthetic forms like (1) or also periphrastic forms like (3), (4), and (5)? 
If it is restricted to synthetic forms, can it include only second-person forms or also 
first- and third-person forms? 

There is no consensus among contemporary linguists on any of these questions. 
As a result, descriptive grammars define the imperative paradigm for a language 
differently. Some grammars include only specialized second-person constructions 
(Mithun 1999, Svedova 1980, Wierzbicka 1995). Others (e.g., those for Greek, Cree, 
Church Slavonic, Celtic languages) include all synthetic forms, irrespective of their
person value (Hopper & Traugott 1993, Jespersen 1992/1924, etc.). Still others include 
values for all persons and allow as members both synthetic and periphrastic 
constructions; such paradigms are called ‘mixed’ paradigms (Khrakovskij 1992). 

2. peculiarities of mixed paradigms. Mixed paradigms are heterogeneous. They 
are expected to include, in the first place, all synthetic constructions represented by 
specialized ‘imperative forms’ of the verb (with or without pronouns), which can refer
to any person value, as in (1). 

(1) a. English       Go! (2SG/PL)
    b. Russian       Idi! (2SG); Idite! (2PL)
    c. German        Gehe! (2SG); Geht! (2PL)
    d. Spanish       Ve! (2SG); Vayan! (2PL)
    e. French        Va! (2SG), Allez! (2PL), Allons nous! (1PL)
    f. Scots Gaelic  Seas! (2SG), Seasaibh! (2PL), Seasamaid! (1PL),
                     Seasadh esan! (3SG), for ‘stand up’ (MacAulay 1992:165)

These paradigms also include diverse non-synthetic, periphrastic constructions. Some 
periphrastic constructions in mixed imperative paradigms have causative or modal 
verbs, as in (2). 

(2) a. English Let’s go! 

b. German Laß uns gehen!

c. German Wollen wir gehen! 

Others use forms primarily associated with categorial meaning other than imperative,
e.g., Subjunctive, Future or Present Indicative, which are often combined with a
subordinator, as in (3). 

(3) a. French Qu’ils aillent! (3PL) ‘Let them go!’ 

b. Spanish Que nosotros vayamos! (1PL) ‘Let us go!’

Others use delexicalized verbs or auxiliaries, some of which are reduced to particles, 
as in (4). 


(4) a. Russian Davaj/te (2SG/PL) pojdem (1PL)! ‘Let us go!’

b. Russian Pust’ on idet! (3SG) ‘Let him go!’

In contrast to standard verbal paradigms with a permanent categorial marker 
throughout person/number oppositions, ‘mixed’ paradigms characteristically change 
the encoding of Imperative with the shift in person/number. Compare the English
examples in (5), where 2SG/PL is a specialized synthetic form and first- and third-person
forms are identical periphrastic forms (the lexeme of the content verb is underlined,
the formants are bolded), with the Russian examples in (6), where all three
person-values are marked differently: second-person synthetically, first- and third-person
as different periphrastic constructions.


(5) a. 1SG/PL:  Let me/us go!
    b. 2SG/PL:  Go-(∅)!
    c. 3SG/PL:  Let him/them go!


(6) a. 1SG/PL  Dava-j/te ja/my po-jd-u/em!
               ‘give’-IMP.2SG/PL 1SG/PL.NOM PRF-go-1SG/PL
               ‘Let (thou/you) me/us go!’

    b. 2SG/PL  Id-i/te!
               go-IMP.2SG/PL
               ‘Go!’

    c. 3SG/PL  Pust’ id-et/ut!
               PART go-PRS-3SG/PL
               ‘Let him, her/them go!’

Conversely, in Yucatec Maya, one form is periphrastic, (7)a, and two forms are different
synthetic forms, (7)b and c:

(7) a. 1PL     Ko’ox j kay!
               HORT SUBORD sing
               ‘Let’s sing!’

    b. 2SG/PL  tz’ib’-t-e!
               write-TR-IMP
               ‘Write it!’

    c. 3SG     Wen-ek (∅)!
               sleep-IRREAL (3SG)
               ‘Let him sleep!’

    (Andrew & Ojeda 1994)


Singling out mixed paradigms raises the issue of ‘multiple marking’. There can be 
a whole set of constructions associated with non-second-person values, sometimes 
even with second-person. Thus (8) shows the Russian 1PL Imperative expressed in a
variety of ways. 


(8) a. Davaj(te) pojdem!
    b. Pojdem(te)!
    c. Pojdem(te)-ka!
       ‘Let’s go!’ (1PL)


Spanish has alternative constructions for all persons, including second-person, shown 
in (9): 


(9) a. Vete! 

b. Ve! 

c. Que tu vayas! ‘Go! (2SG)’ 

Because of this multiplicity of constructions, there is a problem of choice. Should all
the constructions be included in the paradigm? If not, which of them should be considered
basic, and consequently included in the mixed paradigm? This is a special
issue not to be discussed in detail here. But at least two criteria of choice can be mentioned.
First, one may choose the construction with the highest pragmatic neutrality;
thus, in (9) Ve! is more abrupt than the neutral Vete!, and (8)c is too unceremonious
as compared to (8)a and b. Second, one may choose the construction which combines
with more verbal lexemes; for example, the construction in (8)a does not have restrictions
on combining with verbal lexemes or aspectual forms, whereas (8)b certainly
does. Thus (10)a inarguably has a straightforward Imperative meaning, but (10)b has
a strong inchoative component of immediacy, and (10)c is acceptable only under special
contextual circumstances.

(10) a. Davaj(te) citat'! 

b. Citajem! 

c. ? Citajemte! 

‘Let’s read!’ 

3. INCOMPATIBILITY OF THE CONCEPTS ‘MIXED PARADIGM’ AND ‘VERBAL CATEGORY’. 

A regular verbal paradigm is characterized by the following features: 1) its members
are verb-forms, 2) categorial marking is uniform throughout person/number oppositions,
3) the form of a marking mechanism is specific for each category, and 4)
one member of a paradigm has only one way of actualization. Mixed paradigms are
incompatible with these properties of a ‘regular’ paradigm. First, members of mixed
paradigms are not all verb-forms, even in the weakest sense of the term, in which
‘analytical’ forms Aux+V can be accepted as morphological (cases like (2)a and c, or (4)b),
but not cases like (3), (4)a, or (7)a, where the marking mechanisms are scattered over
the whole construction. Second, the categorial meaning is often marked differently
in different persons, cf. (5), (6), and (7). There are two other features of mixed paradigms.
First, forms associated primarily with other categories appear in them (e.g.,
subjunctive in (3)a and (7)c; future indicative in (4)a, present indicative in (4)b, (6)c,
and (9)c; hortative in (7)a). Second is the existence of multiple mechanisms of marking
imperative for one and the same person-value ((9) and (10)). The first two properties
of verbal paradigms are indisputably basic and crucial, and a violation of them
is incompatible with the essence of such a paradigm. Whether the last two features
are compatible with it can be disputed. It is common knowledge that many grammatical
markers are multifunctional and can be used as markers of several grammatical meanings,
as a rule distinguishable from one another in the lexico-grammatical context. The existence of
a multiplicity of forms marking one and the same person-value can be explained, as
was mentioned earlier, by pragmatic and combinatorial factors.
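
The second and fourth properties above (uniform marking across persons, and one realization per person-value) lend themselves to a mechanical check. The sketch below is only an illustration of that reasoning: the `Cell` record, the labels "synthetic" and "periphrastic", and the function names are assumptions introduced here, and the classification of Pojdem(te)! as synthetic is made for the sake of the example rather than taken from the analysis.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cell:
    person: str    # person value: "1", "2" or "3"
    form: str      # citation form taken from the numbered examples
    marking: str   # "synthetic" or "periphrastic" (labels assumed here)

def is_uniform(paradigm):
    """Property 2: every cell uses the same marking mechanism."""
    return len({cell.marking for cell in paradigm}) == 1

def has_multiple_marking(paradigm):
    """Violation of property 4: some person value has more than one form."""
    persons = [cell.person for cell in paradigm]
    return len(persons) != len(set(persons))

# English, following example (5).
english = [
    Cell("1", "Let me/us go!", "periphrastic"),
    Cell("2", "Go!", "synthetic"),
    Cell("3", "Let him/them go!", "periphrastic"),
]

# Russian 1PL alternatives, following (8)a and (8)b; the label on the
# second cell is an assumption made for illustration.
russian_1pl = [
    Cell("1", "Davaj(te) pojdem!", "periphrastic"),
    Cell("1", "Pojdem(te)!", "synthetic"),
]

print("English uniform marking:", is_uniform(english))
print("Russian 1PL multiple marking:", has_multiple_marking(russian_1pl))
```

On this encoding the English paradigm fails the uniformity test and the Russian first-person plural shows multiple marking, which is the sense in which mixed paradigms depart from a regular verbal paradigm.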

Thus the concept of mixed paradigms contradicts in at least two respects the principles
of ‘regular’ verbal paradigms.

4. THEORETICAL IMPLICATIONS OF THE RECOGNITION OF MIXED PARADIGMS. 

Acceptance of a mixed paradigm requires justification for extending the Imperative
category to non-second-person periphrastic constructions which violate the
basic properties of a verbal paradigm. But refusal to accept mixed paradigms is also
unsatisfactory, on at least two grounds. First, from a cross-linguistic point of view, it
seems arbitrary to exclude strictly grammaticalized constructions from a given paradigm
simply because they are not expressed synthetically in a particular language;
cross-linguistic description deals with languages with a variety of marking patterns,
and needs common theoretical grounds for describing any category. Second, it is not
satisfactory to exclude all non-second person forms, whether synthetic or periphrastic,
from Imperative paradigms (as in Mithun 1999:171 ff.) because, if even synthetic
non-second-person forms which are morphologically parallel to second-person
Imperative forms are not Imperative forms, what category are we to affiliate them
with, declarative or interrogative?

So we have a paradox. We cannot neglect some data, i.e., constructions functionally
identical to synthetic forms and expressing an immediate directive (and thus
inherently imperative). But the existing theory cannot accept them because of its
internal restrictions. This paradox is evidence that the existing theoretical framework
is inadequate to the data and that there is a need for a new and more adequate framework.
The theory that the imperative is a verbal category of mood predicts, given
certain assumptions, that the imperative will be a verb form at least in all languages
where this category exists and that there will be consistent marking of this category
throughout the person/number paradigm. But there are languages in which imperative
meaning is apparently expressed by periphrastic constructions which cannot be
reduced to the notion of a verb-form and there is no consistency in marking elements
of this paradigm. This fact refutes the assumption that the imperative is a universal
verbal category. We have a typical example of Popper’s ‘method of falsification’ which
considers the inability of a hypothesis to withstand refutation as evidence of its falsehood.
In short, the hypothesis that the imperative is a verbal category does not withstand
efforts to refute it.

To accommodate mixed paradigms, the new framework must be able to explain at 
least five types of evidence: 

1. the inclusion of verb-forms and periphrastic constructions in a single category;

2. the multiple constructions used by different languages to mark each person-value;

3. the frequent difference in marking of Imperative meaning in first, second,
and third persons;

4. the frequent use of forms associated with other verbal categories (causativity,
modality, subjunctive, hortative, future indicative) as markers of Imperative
in non-second-person synthetic forms; and

5. the reconciliation of the need for an addressee with the fact that the Directive
is issued to the first or third person.

5. speech-act-based interpretation of the imperative. All this evidence can be 
easily explained within a framework where the imperative is interpreted as a category 
of speech act. The speech-act nature of the imperative is widely discussed (Hamblin
1987, Jakobson 1995/1976, Lyons 1977, Paduceva 1985, Sadock and Zwicky 1985, Wierzbicka
1972, 1995, etc.); it is also considered a form of directive (Searle 1975, 1976).
Within a speech-act approach three specific claims are made. First, the imperative
does not modify the verb but a proposition (Bybee 1985, Hamblin 1987, Foley & Van
Valin 1984, etc.). Second, the semantics of the imperative is not a matter of subject-predicate
relations inside a proposition, but is speaker-oriented and thus outside the
proposition (Bybee et al. 1994, Grice 1957, Foley & Van Valin 1984, Jespersen 1992/1924,
etc.): the speaker wants to cause cooperation or willingness, etc. Third, the semantics
of the imperative is not a simple primitive but is a complex of semantic components
(Wierzbicka 1972, 1995, Hamblin 1987, Moutafakis 1976). But these three claims are
made as separate considerations. They have never been put together as a system of
interconnected properties. So proponents of the speech-act approach have never
proposed that we should consider the imperative a special type of grammatical
category uniting various synthetic forms (morphological forms) and constructions
(‘syntactic forms’) within one paradigm. The consensus remained that the central
synthetic forms belong to a verbal category of mood and that the periphrastic constructions
belong to syntax.

I incorporate the discoursal features of imperative in one system and claim that 
the imperative is a regular grammatical category, but of a special type. It is not a
part-of-speech category but a ‘frame-forming’ type of category. Such categories are formed
by the introduction of additional predicates which specify the content of a proposition
in certain ways. I discuss the implications of this approach for the theory of the
imperative and demonstrate that within this approach the concept of mixed paradigms
can be naturally accepted.

6. PLACEMENT, GRAMMATICALITY AND MULTIPLICITY OF MARKERS IN MIXED PARADIGMS.
If the imperative is a frame-forming category, it follows that the invariant
component of this category is a proposition, not a verb. If so, then the imperative
paradigm is a set of constructions and the marker of the grammatical variable (imperative)
can be positioned anywhere in the construction. The marker is not necessarily
attached to the verb. Thus it is irrelevant whether members of the paradigm are constructions
or verb-forms, since both realize a proposition.

One might question how we can tell whether a periphrastic construction is part of a 
grammatical paradigm. I suggest that in order for a formal change in a construction to 
qualify as a grammatical mechanism it must be grammaticalized and combine regularly 
with propositions to form the meaning in question. In the present example, these condi¬ 
tions are met. Every language has a set of permanent markers to encode the transition 
from declarative to imperative, or to distinguish the imperative from the interrogative.

The existence of multiple marking mechanisms for each non-second person is, as 
factual evidence suggests, superficial. In such a set of ‘synonymous’ constructions only
one construction is neutral; the others encode imperative with all kinds of face work 
components (politeness, etiquette forms, hierarchy, etc.) or are not as regular as the 
basic one. Several languages in the database used have multiple constructions even for
the second person: Spanish, Russian, Mayan, and Mandarin. Mandarin, according
to informants, has several options for second-person constructions with diverse par¬ 
ticles. The selection among them depends on a variety of social factors (e.g., polite¬ 
ness strategies, strength of pressure). 

Thus, the selection of basic constructions (the ones least loaded with pragmatic 
additions) constituting a mixed paradigm is governed by quite strict grammatical 
rules and is never random. In short, the mixed paradigm meets the properties of 
formal grammaticality which can be expected from any grammatical category. 

7. components of imperative meaning. The composite meaning of the imperative 
explains why it can be (and actually is) encoded differently in different person-value 
constructions, why it uses ‘alien’ forms in non-second-person constructions, and how
the addressee’s role is preserved in all constructions. 

I claim that the semantic structure of the imperative (considered as a marked 
member in the opposition declarative-imperative) includes at least the following 
four components: 

1. an appellative component (Jakobson’s ‘conative function’),

2. a causative/volitional component (which, if necessary, can be broken into 
volition and causation), 

3. the proposition represented by a content verb, and 

4. ‘framing-inclusion’ relations (to frame and to be framed). 

Identification of these components is supported by three independent types of evi¬ 
dence: cross-linguistic (a list of types of marking mechanisms reiterated in different 
languages), pragmatic (analysis of semantic components singled out within commu¬ 
nication studies), and logical (decompositions of Imperative meaning by logicians for 
formal language description). The lists of proposed components gathered from these 
three independent sources are amazingly similar. On my analysis, then, the Impera¬ 
tive situation is a semantic hybrid, and consequently a syntactic hybrid, of diverse 
components. It encompasses three component situations: the appellative situation, 
the causative/volitional situation, and the content situation. 

8. mechanisms of marking imperative. Each component situation has its own 
predicate and its own set of arguments. As in any hybrid, the component situations 
interact and overlap. Arguments interact, and predicates interact, including first-order 
predicates in the framed situation and second-order predicates in the framing situa¬ 
tion. Each type of interaction must be marked explicitly, but the question of how it 
can be marked needs elaboration. 

8.1. arguments. The appellative situation necessarily includes two arguments: 
Addresser and Addressee. It may or may not include a third participant, a non-direct 
interlocutor. The causation/volition situation includes a Causer (issuer of causation
and bearer of volition) and a Causee (someone who is caused to act, but can have
wilfulness of his own). The content situation has at least one argument, a ‘Doer’ (D), 
though it can be more complicated in languages with prototypical passive-based 
imperatives (e.g., Austronesian languages), where the Patient of the content situation 
is more important than D (Manaster Ramer 1995, Polinsky 1992). Blending the three 
sets of participants results in a new set with three macroroles: the speaker S, the lis¬ 
tener L, the third party T. D does not exist on its own; it overlaps with S, L, or T. 

The macrorole S (Speaker) must combine the roles of an Addresser and a Causer 
and can overlap with D. Since it is always associated with 1sg, it does not need marking,
unless it overlaps with D. The macrorole L (Listener) must include the Addressee
role and can also include the roles of Causee and Doer; this combination is the pro¬ 
totypical combination in the imperative situation. L is associated with the second 
person and is part of a central specialized form of the imperative. The macrorole T 
(third party) becomes part of the imperative situation only if it overlaps with D; this 
role must be explicitly marked. 

Thus, the marking of the imperative must obligatorily include marking D if D 
is not the Addressee. Even where D is the Addressee (i.e., in second-person con¬ 
structions), marking of D is strictly obligatory if specialized pronouns are used for 
imperative as in Palauan, which has ‘hypothetical’ pronouns for imperative (Josephs 
1975:110). D is also obligatorily marked if person/number categories are marked on 
the imperative form of the verb. The mechanisms for marking D are standard: verb- 
agreement and/or pronouns (Sadock & Zwicky 1985:171). 

(11) a. English Go! Let us go! Let him go! 

b. Russian Idi/Idite! Davaite poidem! Pust' idet!

c. German Gehe! Geht! Wollen wir gehen/Laß uns gehen! Laß ihn gehen!

The marking of D is an indispensable component of an imperative construction, not 
an agreement category as in verbal categories. Evidence from cross-linguistic com¬ 
parisons proves that all other personal markers are optional, though some languages 
(Russian, Spanish) mark person/number values for Listener/Addressee on Auxiliaries 
or particles separately from person/number marking on the content verb that communicates
D, as in (12) and (13).
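
The conditions on marking D stated in this section can be restated as a small decision rule. The following Python sketch is only an illustration under stated assumptions: the class and function names are hypothetical, and the three inputs (which macrorole D overlaps with, whether the language uses specialized imperative pronouns, and whether person/number is marked on the imperative verb form) are simplifications of the prose above, not part of the original analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImperativeConstruction:
    """Hypothetical record for one member of an imperative paradigm."""
    doer: str                            # macrorole D overlaps with: "S", "L" or "T"
    specialized_pronouns: bool = False   # e.g. Palauan 'hypothetical' pronouns
    person_number_on_verb: bool = False  # person/number marked on the imperative verb


def doer_marking_obligatory(c: ImperativeConstruction) -> bool:
    """Return True if, on the rule sketched in the text, D must be marked."""
    if c.doer not in {"S", "L", "T"}:
        raise ValueError("doer must be one of 'S', 'L', 'T'")
    # D is not the Addressee (first- and third-person constructions):
    # marking of D is obligatory.
    if c.doer != "L":
        return True
    # D is the Addressee: marking is obligatory only under the two
    # additional conditions mentioned in the text.
    return c.specialized_pronouns or c.person_number_on_verb
```

On this sketch a third-person construction always requires marking of D, whereas a plain second-person form does so only when one of the two additional conditions holds.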

8.2. predicates. The predicates constituting an Imperative situation are of two types, 
‘framing’ (appellation, causation/volition) and ‘framed’ (content verb). The framing 
predicates define the nature of the speech act; the framed predicate defines its content. 
The framed predicate must be actualized by the content verb; the only variation is in 
the morphological form of this verb, which I will discuss shortly. The framing part 
must also be actualized. But the way it is actualized varies. It can be encoded by a spe¬ 
cial inflection on the content verb, as is the rule for second-person Imperatives and 
sometimes for other synthetic constructions. In first and third person constructions 
it can take the form of an actualization of one of the framing predicates. Selection 
among these predicates evidently is based on what was diachronically chosen as a
semantic dominant of a directive: causation, volition, or the need for the addressee’s 
approval, etc. The appellative predicate can be actualized directly, as in Russian, where 
it combines with a component of expectation of cooperation on the part of L. Diachronically
this auxiliary goes back to an imperative with the meaning ‘give’, which is
preserved in the content verb.

(12) Dava-j/-te ja po-jd-u!

give-2SG.IMP/-2PL.IMP 1SG.NOM PRF-go-FUT.IND.1SG

‘Let me go’; literally, ‘Hey-you (sg/pl), I will go!’

More commonly the appellative predicate is represented indirectly via its arguments: 
by vocatives or by number-agreement marking of the Addressee, as in Spanish. 

(13) Dej-a -me leer la nota! 

let -sg.adr -sg:refl read the note! 

‘Let me read the note!’ 

Actualization of the appellative predicate evidently highlights the recognition of the 
existence of L or of the need for L’s cooperation/agreement. The fact that the address 
function of imperative is explicitly encoded in many constructions is evidence that 
it is inherently present in each of the three-person forms or constructions of the 
imperative paradigm. 

Causation/volition predicates are more commonly actualized than the appellative 
predicate in imperative constructions. They can be marked straightforwardly by 
grammatical causatives, though no longer preserving the causative grammatical 
meaning (English let, German lassen ‘let’). Alternatively, they can be marked by par¬ 
ticles (Russian pust', going back to causative pustit' ‘release/let go’), by other delex- 
icalized verbs, or by other auxiliaries which go back to grammatical inchoatives 
originating from ‘movement’ verbs as with Hebrew ba ‘come’ (Malygina 1992:146) and 
Yucatek Mayan ko’ox/ko’on-e’ex ‘go’ (Hofling & Ojeda 1994:282).

In summary, the framing predicates must be actualized, but the way they are actu¬ 
alized depends on the selected dominant, level of grammaticalization, or paths of 
grammaticalization (Bybee et al. 1994, Hopper & Traugott 1993). There can even be 
zero actualization, marked only in the demonstrated dependent status of the content 
verb (e.g., in Spanish que-constructions). 

8.3. framing. Marking the framed status of the propositional predicate is obliga¬ 
tory. There is a variety of mechanisms to actualize this meaning. They all mark the 
non-actual character of the action, its shift from an objective reality to a desired or 
possible one. So the content verb may be ‘marked’ as an infinitive (as in English, 
German), a future (Russian), an optative form (as in Turkic languages), or a Subjunc¬ 
tive form (Romance languages). Or the content proposition may be introduced by 
subordinators (all equivalents of that) like French que, Spanish que, Rumanian să,
Mayan -j-. Sometimes the framed status of the predicate in indicative forms is marked 
by a different word order, as in German Gehen wir schneller! ‘Let’s go quicker!’. 

8.4. AUTONOMY AND GENERIC FUNCTION OF MARKING MECHANISMS. Almost no
Imperative construction explicitly encodes all the semantic components which mark
a shift from declarative to imperative; only some are marked. Marking the shift from 
indicative to imperative within each of the three domains discussed above (person of 
D, ‘framing’, and ‘inclusion’) is obligatory, but the complexity of imperative meaning 
makes it impossible to predict which marker is selected. 

The independence of marking each component explains the existence of diverse 
ways of marking the imperative within a paradigm and at the same time preserves 
its holistic meaning. I speculate that different markings realize the component(s) of 
the complex imperative meaning that were historically treated as dominant. Thus 
first-person constructions often imply the cooperation of L, or even L’s willingness to 
participate (cf. German wollen ‘will/want’). Since the whole set of markers is never 
realized within one construction, those which are realized serve as generalized ‘repre¬ 
sentatives’ of the whole set of semantic components constituting the imperative mean¬ 
ing. Which of the markers is actually present in the construction is language-specific. 
One can predict that there will be some marking, and one can specify the possibilities 
from which the marking can be chosen, but one cannot predict the actual choice. 

9. conclusion. I have claimed that the mixed imperative paradigm better reflects the 
nature of imperative than the second-person synthetic paradigm. I argue that the pecu¬ 
liarities of mixed paradigms can be explained within a framework treating the imperative 
as a speech-act category rather than a verbal category. I have laid out this framework as a 
systematic complex. The imperative is a frame-forming category of a proposition, cluster¬ 
ing with declarative and interrogative. The semantics of this category is oriented to the 
interlocutors and is a composite of numerous components. Within such an approach, 
the mixed paradigm is a grammatically regulated set of person/number oppositions with 
predictable linguistic behaviour, not a disorganised set of ‘imperative syntactic constructions’
with random properties.

In support of my claims I discuss diverse types of evidence: factual evidence, evidence
of inconsistencies in the old approach, and evidence of the greater explanatory
force of my position. These are the three basic types of evidence relevant to a change 
of theoretical approach in science. 


NEGATIVE-IMPERATIVE CLITIC PLACEMENT IN ITALIAN:
SYNTAX OR PHONOLOGY? 


Floricic Franck 

ERSS - Université de Toulouse II - Le Mirail


the question of clitics 1 is a very complex one constituting a field of inquiry in 
its own right. This paper addresses the question of clitics in negative imperative con¬ 
structions in Italian. As is well known, Italian clitics precede finite verb forms, as 
indicated in (1). However, the examples in (2) show that clitics follow the verb in non- 
finite contexts, as well as in the imperative. 


(1) a. Lo prendo
       it take-1SG.
       I take it.

    b. Bisogna che lo prenda
       [it is] necessary that it take
       I have to take it.

(2) a. Presolo per mano
       taken-him for hand
       Once I took his hand,

    b. Prendendolo
       taking-him/it

    c. Prendilo
       take-it!

    d. Penso di prenderlo
       think-1SG. of take-it
       I think of taking it.


Now, in the negative imperative and when the infinitive appears after a modal, both 
linear orders are possible: 


(3) a. Non lo fare / Non farlo
       not it do / not do it
       Don’t do it!

    b. Lo devi fare / Devi farlo
       it must-2SG. do / must-2SG. do-it
       You have to do it.

    c. Non lo devi fare / Non devi farlo
       not it must-2SG. do / not must-2SG. do-it
       You mustn’t do it.

    d. Lo puoi fare / Puoi farlo
       it can-2SG. do / can-2SG. do-it
       You can do it.


The problem is how to explain this double pattern in the negative imperative. Is it a 
syntactic phenomenon or a phonological one? 

1. kayne (1992) and the notion of ‘empty modal’. Kayne (1992) introduces the 
notion of empty modal to account for Clitic-Verb order in the Italian negative imper¬ 
ative. According to Kayne’s hypothesis, the infinitive is embedded under the empty 
modal. The empty modal is in turn licensed by the negation non, that is, by the head of
the NegP projection. This analysis therefore allows Kayne to see the Clitic-Verb order
as an instance of clitic climbing. In a sentence like Non lo fare! ‘Don’t do it!’ the object
clitic lo is raised up to the empty modal, as shown in (4). 


(4) Non far-lo (diagram: the clitic lo is raised from the infinitive to the empty modal)


Given that the infinitive is embedded under the empty modal, the existence of this 
empty modal seems to account both for the linear ordering and for the appearance 
of the infinitive. Despite its interest, this analysis has the drawback of putting on the 
same level expressions like Non farlo! and Non devi farlo! ‘you mustn’t do it’, and it
fails to capture the relationship between the non-injunctive use of the infinitive and 
the injunctive one (cf. examples such as Non farlo sarebbe un errore ‘not to do it would 
be a mistake’). On the other hand, it is also possible to consider that in the case of Non 
farlo!, the infinitive is selected because of its semantic neutrality. From this point of 
view, there is no need to postulate the existence of any underlying empty modal. 


2. graffi’s (1996) hypothesis. Another account is that of Graffi 1996. According to
Graffi, in the negative imperative, the negation and the verb are raised to the head 
comp of the Complementizer Phrase (cp). In this case, the sentence Non farlo! shares 
the same syntactic representation as a sentence like Non prendetelo!, as shown in (5). 

(5) [cp[comp Non prendete]] [ip lo]     [cp[comp Non far]] [ip lo]
    Don’t take it!                      Don’t do it!


Agr is not, in the imperative, an autonomous projection but a feature of the head 
comp; according to Graffi, this Agr feature is defective in the sense that it contains 
number features but no person features. Therefore, in the case of Non lo fare!, the 
clitic precedes the verb because the features of Agr in comp are weak in the sense of 
Chomsky (1995, chapter 3) 2 . As a matter of fact, in the minimalist theory, all move¬ 
ment is triggered by the need for feature checking. Depending on the strength of the 
feature, the movement is either overt (when induced by a strong feature) or covert 
(when induced by a weak feature). Therefore, given the weakness of the Agr features 
in comp, overt raising of the verb is not required: the verb raises in comp at LF and 
the imperative value is brought about by the negation in comp. 

Now the contributions of Kayne and Graffi touch on many interesting questions that 
deserve a more detailed discussion. The main point is that their approaches share the 
presupposition that an autonomous syntactic analysis can adequately account for the 
distributional properties of clitics; in this kind of approach, the only question is to iden¬ 
tify, within the functional organization of the sentence, the base position and the land¬ 
ing site of the items. However, the points to which I would like to draw attention are the 
semantic neutrality of the infinitive and the prosodic weakness or deficiency of clitics. 

3. description of the Italian data. First of all, recall an important fact about clitics
in Italian: they never precede the negation non or the infinitive in root contexts;
therefore, the essential frames are those in (6)a-b. 

(6) a. [[non] (cl)] b. [[Vinf] (cl)]

The converse holds for contexts of clitic climbing: they always precede modal verbs 
in root contexts, as in (6)c-d. 

(6) c. [(cl) [modalV]] d. [[Vinf] (cl)] 

Now when we put these frames together, we obtain the structures in (7). 

(7) a. [[[non] (cl)] [[Vinf] (cl)]] b. [[(cl) [modalV]] [[Vinf] (cl)]]

At this point, two questions arise. Given the structures in (7)a-b, why does Italian prohibit
sentences like *Non lo farlo or *Lo voglio farlo, with reduplication of the clitic 3 ,
and why is the so-called clitic climbing phenomenon limited to modal contexts? To 
answer the first question, we suggest that the two clitics are adjacent in the Verbal 
domain, as in (8). 


(8) non (diagram: the two clitics adjacent in the Verbal domain)

Now a configuration like (8) must be ruled out because of an OCP violation: two adjacent
elements of the same nature cannot co-occur in the same domain 4 . Therefore,
one of the two clitics must be deleted. To answer the second question, we must recall 
that from a semantic point of view, the infinitive stands, in the mood system, in the 
same position as the present in the tense system, the nominative in the case system 
or the third person in the person system 5 . In other words, it is the neutral element of 
the system. It is neutral insofar as it lacks any specification, and it is neutral insofar 
as it can be assigned a default value. Of course, this characterization of the infinitive 
is not new. It is commonly assumed among philologists that the infinitive is semanti¬ 
cally unspecified. This idea is defended by Skytte (1983:24): 

At the semantic level, we can say that the infinitive... is the verbal form which 
expresses in the most neutral way the pure content of the verbal stem. A content 
which, naturally, in a given context, lets itself be modified in different ways. 

In other words, the infinitive is a non-specified form which is nonetheless contextu¬ 
ally specifiable; and its specifiable nature stems from being a tense in posse, as Guil¬ 
laume (1929) says 6 . This is the reason why infinitival imperatives are compatible with 
different linear orders, while, for example, negated gerunds only allow enclisis. From
this point of view, we can consider that the double possibility of Non lo fare! and Non
farlo! is at least in part governed by the syntactic and semantic properties of negation 
and the infinitive. 
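
The reasoning of this section can be restated procedurally. The following Python sketch is only an illustration under stated assumptions: the function name surface_orders and the placeholder "CL" are hypothetical, and the OCP is modelled very crudely as the requirement that only one of the clitic slots of a combined frame such as (7)a or (7)b may be filled. Given a combined frame, the function returns the linear orders that remain once all but one of the clitic slots are deleted.

```python
CLITIC_SLOT = "CL"  # hypothetical placeholder for an optional clitic position


def surface_orders(frame, clitic):
    """Return every order obtained by filling exactly one clitic slot.

    frame  -- list of words with CLITIC_SLOT marking each potential clitic site
    clitic -- the clitic to be realized, e.g. "lo"
    """
    slots = [i for i, item in enumerate(frame) if item == CLITIC_SLOT]
    orders = []
    for keep in slots:
        # Keep all non-clitic material; realize the clitic only in slot `keep`
        # and delete every other clitic slot (the crude OCP repair).
        order = [
            clitic if i == keep else item
            for i, item in enumerate(frame)
            if item != CLITIC_SLOT or i == keep
        ]
        orders.append(order)
    return orders


# Frame (7)a: [[[non] (cl)] [[Vinf] (cl)]]
negative_imperative = ["non", CLITIC_SLOT, "fare", CLITIC_SLOT]
# Frame (7)b: [[(cl) [modalV]] [[Vinf] (cl)]]
modal_context = [CLITIC_SLOT, "devi", "fare", CLITIC_SLOT]

print(surface_orders(negative_imperative, "lo"))
print(surface_orders(modal_context, "lo"))
```

The doubled forms *Non lo farlo and *Lo voglio farlo are never generated, because no output fills both slots. Spell-out details such as the elision in fare + lo > farlo are deliberately left out of the sketch.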

4. prosodic constraints on clitic placement. From a prosodic point of view, however,
the two configurations are not equivalent. This non-equivalence
is mainly the result of the properties of the negative particle non. In keeping
with Zanuttini (1991) and against Belletti (1990), it should be noted first of all that the 
Italian negation non differs fundamentally from clitics. While clitics like lo, la, le, li, ci, 
vi are atonic and prosodically dependent, non can indeed bear stress. More importantly, 
non can be used in a wide range of contexts in which only independent content words 
are licensed. As shown in examples (9)a-c and (9)d-e, non can appear as part of a con¬ 
junct or of a dependent clause. 

(9) a. Dimmi se ti piace o non (ti piace) (Serianni & Castelvecchi 1989:506)
tell me if you like it or not 

b. Gli sposati e non 

the married and unmarried 

c. Noi gli facciamo molte domande, lui qualche volta ride, risponde ad alcune 
e non ad altre, si vede bene che evita certi argomenti (Levi 1958:22) 

We are asking him a lot of questions; sometimes he laughs; he is answering 
some of them but not others; it is clear that he is escaping some topics. 

d. A che cosa potevo io aspirare ormai se non ad un uomo come lui? 

(Moravia 1988:415) 

What could I desire other than a man like him? 

e. Vado più spesso in Italia che non in Germania

I am going to Italy more often than to Germany 

From the examples in (9), it is clear that non shows syntactic and prosodic properties
that make its clitic status questionable. These examples also show how different are 
the forms ne in French and non in Italian. Actually, the properties of non clearly show 
that it has the status of Prosodic Word. In a sentence like Non lo fare!, the negative 
morpheme defines a domain—the Prosodic Word —of which it is the head or nuclear 
element. At the same time, however, there is no reason to associate the clitic to the 
initial foot rather than to the following one. Therefore, I would like to suggest that in 
Non lo fare!, the clitic actually is amphipodic. In other words, as shown in Figure 1, the
object clitic depends at the same time on the foot dominating the syllable non and on 
the foot whose head is the strong syllable of the infinitive. 

It should be observed that the amphipodicity/amphicliticity hypothesis is not new: 

In most languages, enclisis does not occur when the imperative is introduced by
a tonic word, which allows the amphiclisis of the pronoun... Negation belongs to
the tonic introductory elements (French Ne m’aide pas; Italian Non mi aiutate),
as well as strengthening particles (Old French Si m’aidez, or m’aidez; Italian
Or m’aiutate), or conjunctions (French prends ton luth et me donne un baiser
(Musset)). (Lausberg 1971: §723 and §725,2., a), ()), III. (cf. p. 4, Lausberg)

Figure 1. The object clitic depends at the same time on the foot dominating the syllable
non and on the foot whose head is the strong syllable of the infinitive.

We should nonetheless notice that in some cases—namely as a result of both the tight 
relationship between negation and the clitic and the focus prominence on negation— 
the initial consonant of the object clitic lo can assimilate the final consonant of non: 
e.g., Nollo fare! [nol:o'fa:re] ‘don’t do it!’. In this context, non therefore shows a high
degree of cohesion with the clitic. This also recalls expressions like Dammi quello! 
['dam:i'kwel:o] ‘give it to me!’. In both cases, we therefore find an optimal configura¬ 
tion of two trochaic feet, as in Figure 2 (overleaf). 

As shown in Figure 2A, the clitic of the sentence Dammi quello! prosodically and 
syntactically depends on the verb alone. Conversely, in Nollo fare!, the syllable cor¬ 
responding to the clitic is delinked from the following foot and is exclusively attached 
to the preceding one. Now the assimilation of [n] can be interpreted as the segmen¬ 
tal correlate of this prosodic reorganization. From that point of view, the expressions 
Dammi quello! and Nollo fare! show rather similar prosodic configurations. From a 
rhythmical standpoint, they share the same optimal alternation between strong and 
weak positions, as in (10): 

(10) a.  *         *         b.  *         *
         *         *             *         *
         *    *    *    *        *    *    *    *
         Dam  mi   quel lo       Non  lo   fa   re

As shown in (10)a-b, non and the strong syllable of the imperative form have a
prominence at the highest level of the prosodic hierarchy. Now this prominence also 
favours the second position of the clitic in the negative imperative, and the semantic/ 
prosodic weakness of the adjacent clitic enhances in turn the focus prominence of 
the strong syllable: 

Figure 2. An optimal configuration of two trochaic feet (diagram labels: PrW, F, PL [nas], [lat] PL, (Cor)).


(L)’accent d’un mot ou d’une syllabe est d’autant plus fort qu’il y a plus de mots
ou de syllabes sur lesquels il domine... S’il faut appuyer fortement sur un mot,
mettez près de lui un autre mot sur lequel le sens n’exige pas qu’on appuie; et le
mot accentué, quand même il ne se trouve ni au commencement ni à la fin de
la phrase, aura une place avantageuse; car l’accent est mis en relief par un repos
d’accent qui l’accompagne (Weil 1844:91).


(...) the stress of a word or of a syllable is all the stronger the more words or syllables
it dominates. (...) If a word must be strongly stressed, put next to it another
word which need not be stressed from a semantic point of view; the stressed
word will then have a prominent place even though it is neither at the beginning nor at the end of
the sentence, for the stress is thrown into relief by a ‘stress pause’ which goes together with it.

Now the absence of such a prominence on non brings about the obligatory enclisis in
sentences such as Non farlo sarebbe un errore. As pointed out by Nespor (1993:237),
‘In Italian, the main stress of the rightmost word in the sentence is the strongest of
the whole sentence, in the absence of particular emphasis’. Therefore, a sentence like
Non farlo sarebbe un errore has End Rule Final. On the other hand, the second singular
imperative, being a form on-focus, can bear the main phrasal/sentential stress.
That’s why in this case, the strong syllable of the imperative shares with non the major
prominence. In short, the Pitch Accent Prominence Rule (PAPR) takes precedence
over the Nuclear Stress Rule (NSR).

Figure 3. The clitic lo attached to the preceding foot.

The next question is how to analyze the other variant of Non lo fare!, namely Non 
farlo! First of all, it should be noted that the final vowel of the infinitive fare is elided 
before the initial consonant of the enclitic. In this case too, we should interpret the 
elision as a cue for the metrification of the clitic at the foot level 7 . 

In Figure 3, the clitic lo is indeed attached to the preceding foot, which preserves 
its trochaic structure. We should, however, add another important fact: the expression 
Non farlo! shows two adjacent stressed syllables, non and far. Therefore, we have a clash
which is, in this case, tolerated because it represents a minimal clash 8 . In other words, as
shown in (11), we find two adjacent prominences at the level of Prosodic Word; the clash
thus creates an arrhythmic structure which doesn’t trigger any ‘repair strategy’.


(11)  *
      *    *
      *    *
      *    *
      Non  far  lo


Moreover, in some contexts, the first syllable of the infinitive can in turn receive the 
uppermost prominence of the negation—needless to say, this is a double prosodic 
pattern. 

(12)       *
      *    *
      *    *    *
      Non  far  lo

On the other hand, when the negation is followed by an imperative form like Fallo!
‘do it!’, the clash occurs at a higher level; therefore, the violation is greater. It is worth
pointing out that the second person singular of the imperative has an intrinsic focus. 
Given that the stressed syllable of the imperative form is also the most prominent in 
the sentence, and given that non has as its domain the whole sentence, the result is a 
clash at the highest level of the prosodic hierarchy, as in (13).

(13)   *    *
       *    *
       *    *    *
      *Non  fal  lo

Therefore, we have in this case a prosodic and semantic clash which requires the 
implementation of a repair strategy; this repair strategy is brought about using a neu¬ 
tral, amodal form, the infinitive. 
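
The contrast between the tolerated clash of Non farlo! and the excluded one of *Non fallo can be made concrete with a toy computation over grid columns. The following Python sketch is an illustration under stated assumptions, not the analysis itself: the function name clash_level is hypothetical, each syllable is reduced to the height of its grid column, level 1 is taken to be the syllable layer, and the column heights used below are illustrative values chosen to mirror the prose rather than figures taken from (10) to (13).

```python
def clash_level(heights):
    """Return the highest grid level at which two adjacent syllables clash.

    heights -- column heights of the syllables, left to right.
    Two adjacent syllables clash at every level from 2 up to the smaller
    of their two heights; level 1 (the syllable layer) never counts.
    The result is 0 when no clash occurs.
    """
    worst = 0
    for left, right in zip(heights, heights[1:]):
        shared = min(left, right)  # highest level both columns reach
        if shared >= 2:
            worst = max(worst, shared)
    return worst


# Illustrative column heights (assumptions, see the lead-in):
non_lo_fare = [3, 1, 3, 1]  # Non lo fa re: prominences separated by the clitic
non_farlo = [3, 2, 1]       # Non far lo: clash at a lower level only
non_fallo = [3, 3, 1]       # *Non fal lo: clash at the highest level

for label, grid in [("Non lo fare", non_lo_fare),
                    ("Non farlo", non_farlo),
                    ("*Non fallo", non_fallo)]:
    print(label, clash_level(grid))
```

Under these assumed heights the proclitic order shows no clash, the enclitic infinitive shows a clash at a lower level, and the negated imperative shows a clash at the highest level, which corresponds to the configuration said to require a repair strategy.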

5. the diachrony of clitic placement. At this point, I would like to mention a dia¬ 
chronic fact which seems to be relevant to this discussion. In Medieval Italian, expres¬ 
sions like Non lo fare! were very common; the clitics indeed couldn’t stand in initial 
position, and they needed the presence of a tonic form at the beginning of the sentence 
or of the period. Therefore, they usually stood in the second position in the sentence,
the position known as Wackernagel’s position. But Wackernagel’s main work dates to
1892. The end of the nineteenth century gave rise to a large amount of literature on 
word order. Adolfo Mussafia (1886) showed, on the basis of a corpus of medieval texts, 
that in Old Italian, clitics were absolutely excluded from the first position of the sentence;
thus, expressions like Pregoti [pregoti] ‘I pray you’, Pregailo [pre'gajlo] ‘I prayed
him’, or mostrollo [mos'trol:o] ‘(s)he showed it’ were very common. As we can see,
the clitic placement rules have undergone a deep change from the Old to the Modern 
Standard Italian system. However, Mussafia does not really explain how and why this 
change took place. It is not the aim of this paper to answer this difficult question. None¬ 
theless, we should mention an interesting hypothesis put forth by Weil (1844): that is, 
the general rhythm of the language has evolved from descending to ascending. From 
this point of view, Italian has evolved from End Rule Initial to End Rule Final. This
change would in turn have caused the shift from enclisis to proclisis. This hypothesis is attractive,
but it deserves a more detailed analysis than can be carried out here. Mussafia’s
crucial point is the synchronic manifestation of the constraints mentioned above. That 
is, the second position of the clitic in expressions like Non lo fare! is a relic of the old 
constraint on the clitic order. In Mussafia’s own words, ‘We find yet a relic of the old use
when the negation Non precedes the verb; the ancients said: Non lo ajutate for the same 
reason that they said Or lo ajutate’ (Mussafia 1886). 

It must be recalled that from a Wackernagel-Mussafia perspective, the constraint
on the placement of clitics in Old Italian and in other Indo-European languages has an
essentially phonological raison d’être. In other words, clitics are phonologically weak elements
which need a host. If we accept the hypothesis that Old Italian had a descending
rhythm, enclisis at the beginning of the sentence therefore naturally follows. Evidently,
Modern Standard Italian has maintained and even generalized enclisis in the imperative
because once again, the imperative is a focused form. Now the enclitic imperative
forms show a very interesting phonological pattern: when attached to a monosyllabic
imperative, the initial consonant of the clitic is systematically geminated (Figure 2A, a
portion of which is here repeated as part of Figure 4). In the imperative forms of the
verbs dare ‘to give’, fare ‘to do’, and stare ‘to stay’, gemination indeed is the result of a
delinking-relinking process.

Figure 4. Gemination of the initial consonant of the clitic mi when attached to a
monosyllabic imperative.

We therefore have a change from a moraic trochee to a syllabic trochee; but in any 
case, the trochaic pattern is preserved. The clitic is thus incorporated into the foot of 
which it constitutes the weak element. That the imperative forms da’, fa’, sta’, va’, di’ 
are truncated is confirmed by the other imperatives of Italian as well as those of other 
languages. In fact, crosslinguistically, imperatives are often truncated forms. Now in
the case of Stammi!, Fammi!, Dammi!, Vammi!, Dimmi!, the question is why gemination
is preferred to, say, vowel lengthening or simple adjunction to the base form. In
other words, why don’t we find something like Daimi! [’dajmi] or Dami! [’da:mi]? The
question is all the more interesting since forms like Pregailo [pre’gajlo] ‘I prayed him’
are usual in Old Italian. Actually, there is a bundle of convergent criteria which results
in gemination. As mentioned, gemination can be considered a cue for the metrification
of the clitic at the foot level; in a parallel fashion, the final syllable of the present
indicative forms sanno, danno, stanno, fanno must not be considered extrametrical,
but indeed forms part of the foot. Moreover, the syllabic trochee is the unmarked foot
of the Italian metrical system. Therefore, the construction of a syllabic trochee converges
on the unmarked. It should also be pointed out that in standard Italian, vowel
lengthening occurs in stressed penultimate syllables only: in other words, Italian absolutely
prohibits long vowels in stressed final syllables. Now in the case of words like Dai!,
the head of the falling diphthong becomes final when the glide is deleted; therefore, 
it cannot be lengthened. The same holds for old forms like mostrommi, in which the 
initial consonant of the clitic is lengthened. We should at this point mention the illuminating
analysis of Schuchardt (1874:14):

Toute voyelle finale accentuée en italien est brève. Par conséquent, si l’on ajoute
par exemple à andrò l’enclitique vi, pour que la voyelle o conserve sa quantité,
il faut que le v s’allonge et que le groupe devienne androvvi, car on ne peut pas
avoir de voyelle brève dans aucune syllabe accentuée et ouverte, à la seule exception
de la syllabe finale.

(Every stressed final vowel is short in Italian. Therefore, if for instance we add to andrò
the enclitic vi, the v must be lengthened for the vowel o to maintain its quantity, thus
resulting in the group androvvi. For we cannot have a short vowel in any stressed and
open syllable, apart from the final syllable.)

Schuchardt’s account rightly insists on the importance of the constraint which in Ital¬ 
ian prohibits word-final long vowels: vowel lengthening essentially occurs in word- 
internal penultimate stressed syllables. Now the conflict between the constraint which 
prohibits word-final long vowels and the one which imposes bimoraicity on (penultimate)
stressed syllables probably is a key aspect of the so-called phenomenon of
Raddoppiamento Sintattico.

6. to sum up, I have argued that the double possibility of enclisis and amphiclisis
in the negative imperative is the result of the interaction of various constraints. First,
the infinitive is semantically neutral and can be associated with a set of contextually
ascribable values. Second, clitics are phonologically weak elements which need a host.
In expressions like Non lo fare! we are dealing with a rhythmically optimal configuration,
as we find a harmonic sequencing of Strong and Weak positions. At the same
time, it is worth noting that the clitic stands in a position it filled systematically in Old
Italian. From this point of view, the second position of the clitic can be considered a
relic of a constraint systematically observed in Old Italian.


I would like to thank Lucia Molinu and especially Prof. Larry Hyman for their comments on
earlier versions of this paper. Needless to say, I am solely responsible for any shortcomings.

According to Chomsky (1995:198), ‘Agr is a collection of φ-features (gender, number,
person). ...French-type languages have “strong” Agr, which forces overt raising; and
English-type languages have “weak” Agr, which blocks it’.

Kayne (1992) mentions Rizzi’s observation that in the course of acquiring Italian, his son
passed through a stage in which he produced sequences like Non lo farlo!. Various Italian
dialects also show this kind of clitic reduplication. This (complex) phenomenon, however,
deserves a much more detailed account than can be carried out here.

The OCP (Obligatory Contour Principle) was introduced by Bantuists to account for
dissimilation phenomena involving tones.

As Peskovskij (1956:131) puts it, ‘Like the nominative case..., which is regarded as a simple,
naked name of the object, without any of the complications in the thought process which
are introduced by the forms of the oblique cases, so the infinitive, because of its abstract
nature, appears to be a simple, naked expression of the idea of action, without those
complications which are introduced into it by all the other verbal categories’ (quoted from
Zirmunskij 1966:79).

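This analysis also recalls that of Heidegger (1952:77):

‘L’infinitif, tel qu’il est entendu dans l’appellation latine, est une forme de mot qui,
pour ainsi dire, coupe ce qui est signifié en elle de tout rapport significatif déterminé.
La signification est détachée (abstraite) de tout rapport particulier. En raison de cette
abstraction, l’infinitif se borne à rendre ce qu’on se représente d’une façon générale
dans le mot. C’est pourquoi on dit dans la grammaire actuelle: l’infinitif est le «concept
verbal abstrait». Ce à quoi on pense, l’infinitif se contente de le saisir et concevoir
abstraitement et en général. Il désigne uniquement cette idée générale. Dans notre
langue l’infinitif est la forme d’appellation du verbe. Dans la forme de l’infinitif, et
dans la signification qu’elle fait apparaître, réside un manque, un défaut’.

(The infinitive, as understood in the Latin designation, is a word form that, so to speak,
cuts what is signified in it off from any determinate signifying relation. The meaning is
detached (abstracted) from every particular relation. Because of this abstraction, the
infinitive confines itself to rendering what one represents to oneself in a general way in
the word. That is why present-day grammar says that the infinitive is the ‘abstract verbal
concept’. The infinitive is content to grasp and conceive what is thought of abstractly and
in general. It designates only this general idea. In our language the infinitive is the
naming form of the verb. In the form of the infinitive, and in the meaning it brings out,
there resides a lack, a deficiency.)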

Further evidence for the metrification of the clitic at the foot level comes from ‘dactylic
shortening’; as a matter of fact, the stressed vowel of a trochaic form like posa [’po:za]
‘put!’ is lengthened, while the stressed vowel of the enclitic form posalo [’pozalo] ‘put it!’
is shortened when a clitic is added. Interestingly, the resulting dactyl recalls the structure
of verbal forms like posano [’pozano] ‘they put’, in which the addition of the third person
plural morpheme -no triggers the same ‘dactylic shortening’.

Recall that according to Nespor and Vogel (1989), adjacent *’s on the first three levels of
the grid represent minimal clash in Italian.

*  *   word
*  *   foot
*  *   syllable
P  P

Therefore, (a) is an instance of a clashing configuration, but not (b), (c) or (d):

(a) Sarà forse partito ‘he may have left’
(b) Sarà ritornato ‘he will have returned’
(c) Avrà mangiato ‘he will have eaten’
(d) Quattro grandi libri ‘four big books’

[Metrical grids for examples (a)-(d).]


LANGUAGE SHIFT IN PROGRESS: EVIDENCE FROM MANDARIN
CHINESE/ENGLISH CODESWITCHING

this paper¹ presents preliminary findings of a research project that investigates the
linguistic consequences of the contact between Mandarin Chinese and English in the
United States. The paper focuses on the analysis of language shift in Chinese children
who recently came to the United States. By examining codeswitching (henceforth CS)²
data collected from three Chinese children and their parents, the paper
demonstrates that CS by the children is subject to fewer structural and functional
constraints (e.g., Pandharipande 1990, Singh 1985, Sridhar & Sridhar 1980)³ than CS by
their parents. Based on such evidence the paper proposes that the children are
undergoing language shift, viz., Mandarin Chinese, the children’s mother tongue, is
being gradually replaced by English as the primary means of communication and
socialization.

The paper first provides the theoretical framework for this research. It then briefly
introduces the demographic background of the community in which
the study was conducted, followed by the methodology of data collection. The bulk
of the paper, the data analysis, is presented next. The paper also briefly discusses
the factors that caused the children’s language shift. It concludes by
pointing out, on the one hand, the contribution it attempts to make, and on the
other, its potential weakness.

1. the theoretical framework. Since the publication of Fishman’s seminal work
on language shift in 1964, research on language attrition and shift in the bilingual or
multilingual context has mushroomed (e.g., Brenzinger 1992, Paulston 1994). Research
on the fate of linguistic minorities in multicultural settings, including the attrition
and shift of the migrated language, has also been well documented (e.g., Extra
and Verhoeven 1999, Paulston 1994).

The mechanism of language shift is multidimensional, and CS is one part of this
machinery. As Myers-Scotton (1992) points out, CS, with a shift in the host language, is an
evident mechanism for language shift. Similarly, Schjerve (1998) observes that CS
may lead to language change in that

(a) [CS] facilitates the functional-pragmatic switch to the dominant language
and (b) [CS] frequently mediates change in the socially non-dominant language,
potentially leading to convergence or even language death.

In light of the evidence from CS, this study shows that the Chinese children’s command
of their native language, Mandarin Chinese, is waning. For one thing, compared with
the adults, the children used the mixed code of Chinese and English in more domains
(Fishman 1964, 1968). For another, in the children’s CS the matrix language, Mandarin
Chinese, permits the permeation of broader categories of English items, both lexical
and grammatical, to the extent that it loses its host language status.

Although evidence shows that English is replacing Chinese as the children’s primary
language, it is unlikely that the children will lose their ability to use Chinese
completely. In other words, both Mandarin Chinese and English are going to be used
by these three subject children, and most probably by their own children. In this
sense, the term language shift in this study does not denote complete language
displacement, or even language death.⁴

2. the demographic background. This study was conducted in a community that 
is located in a college town in the Midwestern United States. Since more than ten percent
of the university enrollees are international students, the population of this town
is notably diverse in terms of nationality. In other words, the community under study
finds itself in a culturally diverse town. Even so, the great majority
of residents in this town are still United States citizens. Consequently, English is the
dominant language in most settings of social life. Other languages, including Mandarin
Chinese, are minority languages.

The subjects lived with their families in one of the university apartment complexes for
graduate students. There are approximately forty such complexes in the community 
under study. Since nearly one third of the residents in these apartment complexes 
are Chinese, this neighborhood is dubbed ‘Little Chinatown’. There were altogether 
around 1,000 Chinese living in this area at the time of data collection. 

Each of the three Chinese families under study comprised two parents and one 
child. The parents were between thirty-three and thirty-nine years old. The male par¬ 
ents in all these families were graduate students at the university in town. One of the 
female parents was also a graduate student at the same university, and the other two 
were part-time students at a local community college. At the time of data collection, 
the parents had been in the United States for about four years. And the child subjects, 
two boys and one girl, had been in the United States for approximately three and a 
half years. Two of the children were nine years old and the third was eleven. They had 
all been enrolled in a local elementary school for more than two years. 

3. the methodology. The data corpus of this study includes one hundred and fifty 
Chinese-English code-switched sentences that start in Mandarin Chinese. Among 
them seventy sentences were produced by the adults and eighty by the children. 

The data were collected primarily through participant observation, which was 
carried out from January through May 2000. The children were observed mainly 
in three different domains: at school, on the playground, and at home. The parents 
were also observed primarily in three different settings: at school, which covers sports
facilities and recreational centers, in grocery stores, and at home. The choice of these
domains helped to ensure that the children and the adults would be observed mostly 
separately, which, in turn, precludes the possibility that the data are seriously compro¬ 
mised by the parents’ potential agenda that encourages the children to speak Chinese 
or, alternatively, English. 

The principal means of data recording was note taking. In order to minimize pos¬ 
sible bias associated with note taking, a tape-recorder was also employed for some 
sessions of the observation, particularly for the recording of conversations among the 
family at home. If conversations were recorded, the recordings were transcribed on 
the same day as they were made. 

To supplement the participant observation data with information collected in 
a more systematic, though undoubtedly less natural, manner, a questionnaire (cf.
Appendix)⁵ was also designed for this study. The questionnaire consists of four sets
of questions, totaling 64 questions. The first set of questions requires the subjects to 
choose the domain(s) in which they have ever code-switched. It also requires them 
to specify other possible settings of CS that are not given. The second set requires the 
subjects to express a specified idea, want, or feeling in the manner with which they are 
the most comfortable. The third set requires the subjects to choose the preferable sen¬ 
tence from pairs of code-switched and non-switched sentences. The last set requires 
the subjects to judge the acceptability of sentences code-switched in different ways. 

4. the data analysis. The study first compares the code-switched sentences actually 
uttered by the children with those produced by their parents. It then examines the 
data elicited by means of the questionnaire, which contain the information about 
the potential as well as actual employment of CS. The result of the data analysis indicates
that although the children’s CS behavior resembles that of the adults, there also
exist patterns of difference concerning both the functional domains of and the structural
constraints on CS.

4.1 the commonality between children’s and adults’ cs. The children and the 
adults share a number of features in their CS behavior. Since the focus of this paper is 
on the discrepancies, not all the observed common characteristics are discussed below. 

As seen in examples (1-2), most of the English elements used in both the children’s 
and the adults’ CS were nouns or noun phrases. The Chinese equivalents of these 
expressions are either cumbersome or difficult to find, for example, to go (as is used 
in the restaurant) in (1), which was uttered by the child, and time-share in (2), which 
was used by the adult. For convenience in reading, in all the examples given in this 
paper Chinese expressions are presented in italicized pinyin (‘the Chinese phonetic 
alphabet’) instead of in characters. The expressions in English are boldfaced. 

(1) Wo bu xiang to go. Wo xiang zai na’er chi.
I not want to go I want prep there eat
‘I don’t like “to go”. I want to eat there [at the restaurant].’

(2) Shui mai time-share? Namo gui.
who buy time-share so expensive
‘Who’d like to buy time-share? So expensive.’

In addition, both the children and the adults usually code-switched to English when 
they started to use a proper name that is associated with the United States, for example,
a US product, a US basketball team, or the name of an American professor. Two
such instances are given in examples (3)-(4) below.

(3) Wo jibenshang bu zenmo kan NBA.
I basically not often watch NBA
‘I don’t watch NBA games so often.’

(4) Mingtian ni qu bu qu Super K?
tomorrow you go not go Super K
‘Are you going to Super K tomorrow?’

Example (5), which was uttered by the child, and (6), produced by the adult, indicate 
that the children and the adults also both switched to English verbs or adjectives in 
their CS, although such instances are far less frequent than switching to nouns. In (5) 
the child used the verb ground, and in (6) the adult used the adjective competitive. 
Again, the Chinese counterparts of these expressions are either burdensome or dif¬ 
ficult to find. 

(5) Wang Zhe bei ta ba ground le.
Name passive his father ground perfective
‘Wang Zhe was grounded by his father.’

(6) Ta laoshi shuo tade chengji haisuan competitive.
his teacher say his score close to competitive
‘His teacher said that his score is close to being competitive.’

Examples (1)-(6) together point to another important feature shared by the
children’s and the adults’ CS: in all these instances English elements fit in
well with the syntactic requirements of the matrix language, Mandarin Chinese. This
type of formal cohesion (Kachru 1983) is also reported by An (1985), which seems to
confirm Poplack’s (1980) Equivalence Constraint that switching should occur at places
where the syntactic structures of the matrix and the embedded language match.

In terms of the functional domain of CS, both the children and the adults were
found to have code-switched in all the identified broad functional domains, for example,
in schools and at home.

4.2. the difference between children’s and adults’ cs.

4.2.1. the fewer structural constraints on children’s cs. Examination of the 
data demonstrates that, in one way or another, the children’s CS is subject to fewer 
formal constraints than their parents’. Some of the relaxed constraints are exemplified 
in (7)—(14) below. 

Examples (7)-(8) suggest that for the children switching from an English item,
be it a verb or noun, to a Chinese particle is quite possible. In example (7) the child
switched from an English main verb to the Chinese aspectual marker le, which indicates
the completion of an action. In example (8) the child switched from an English
noun to the Chinese particle ne, which is used in this case to help establish a fact. In
contrast, such switching is virtually impossible for the adults.

(7) Mama yijing sleep le.
mum already sleep perfective
‘Mum has already gone to sleep.’

(8) Wo hai meiyou zuo proofreading ne.
I yet not do proofreading particle
‘I haven’t yet done the proofreading.’

Examples (9)-(10) show that English system morphemes (function words and
inflections), which in this case are mainly the complementizers if and that, also
occur in the children’s Chinese-English CS. It should be borne in mind, however, that
such deep borrowing (Myers-Scotton 1992) in the children’s CS is not as extensive as
the switching to lexical items. In contrast, switching from Chinese elements to English
system morphemes is not observed in the adults’ CS.

(9) Ta mama bu zhidao if ta wancheng le zuoye.
his mother not know if he finish perfective assignment
‘His mother didn’t know if he had finished his assignments.’

(10) Dan ta mama xiangxin that he’s a good boy.
but his mother believe that he’s a good boy
‘But his mother believed that he is a good boy.’

Another related difference between the children’s and the adults’ CS is that in the
adults’ CS, no English bound morpheme was switched together with a lexical item. In
contrast, the bound morpheme was used together with the lexical item in the children’s
CS. Examples of this type are in (11)-(12), produced respectively by the adult and the
child. Although both types of switching satisfy Poplack’s Equivalence Constraint, for
the children the syntax within the boldfaced noun phrase is English, while for the
adults it is still Chinese, given that English generally marks plurality morphologically,
while Chinese does not. However, this does not mean that it is possible for the children
to switch from Chinese directly to an English bound morpheme.

(11) Zhe xueqi ni yao mai duoshao book?
this term you need buy how many book
‘How many books do you need to buy this term?’

(12) Wo yijing zuowan jintian de assignments le.
I already finish today modifier assignments perfective
‘I’ve already finished today’s assignments.’

Examples (13)-(14) below, where the matrix language in the code-switched sentence
is English rather than Chinese, constitute one more piece of evidence that English is
replacing Chinese as the children’s preeminent language. Myers-Scotton (1992:49) calls
such a change, the change from the status of a guest to a host language, the ‘“outside
goes to inside” change’. She further offers a metaphor to describe this situation, recalling
‘the Russian fable of the wolf which ate the sleigh-horse and thereupon found itself
in harness as a horse-substitute’ (citing Denison 1977:21). Yet again, no sentences
code-switched in this manner are found to have been produced by the adults.

(13) Ta cengjing suffered from pneumonia.
he ever suffered from pneumonia
‘He’s suffered from pneumonia before.’

(14) Bushi everybody likes this teacher.
not everybody like this teacher
‘Not everybody likes this teacher.’

In (13) the only two words in Chinese are the subject of the sentence ta ‘he’ and the
sentential adverb cengjing ‘ever’. And in (14) the only Chinese element is the
sentence-initial negation adverb bushi ‘not’. In a sentence that features CS, the criteria
for determining the host language status are basically twofold: 1) the proportion of
words expressed in a language; and 2) the syntactic functions that these words serve.⁶
Based on either of these two criteria, English is definitely the matrix language
in sentences (13)-(14).
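The first of these criteria lends itself to a simple count. The sketch below is a minimal illustration, assuming that each word has already been tagged by hand as Chinese ("zh") or English ("en"); the function names and language tags are hypothetical, and the second criterion, the syntactic function of the words, is not modeled.

```python
from collections import Counter


def word_proportions(tagged_tokens):
    """Return the share of words per language for one code-switched sentence."""
    if not tagged_tokens:
        raise ValueError("the sentence must contain at least one word")
    counts = Counter(lang for _, lang in tagged_tokens)
    total = sum(counts.values())
    return {lang: n / total for lang, n in counts.items()}


def majority_language(tagged_tokens):
    """Return the language contributing the largest share of words."""
    proportions = word_proportions(tagged_tokens)
    return max(proportions, key=proportions.get)


# Hand-tagged versions of sentences (13) and (14); the tagging is an assumption.
example_13 = [("ta", "zh"), ("cengjing", "zh"), ("suffered", "en"),
              ("from", "en"), ("pneumonia", "en")]
example_14 = [("bushi", "zh"), ("everybody", "en"), ("likes", "en"),
              ("this", "en"), ("teacher", "en")]
```

On this count English supplies three of the five words in (13) and four of the five in (14), which agrees with the judgment that English is the matrix language in both sentences.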

4.2.2. the broader functional domains for children’s cs. The data analysis also
indicates that, compared with the adults, the children code-switched for more functions,
even though both the parents and children code-switched in all the identified
broad domains. In other words, within these domains the children used CS in more
sub-domains. For instance, CS was both reported and observed in
children’s squabbles at home or at the playground, but no such linguistic behavior
was either observed or reported in the adults’ quarrels. Examples in this
domain are provided in (15)-(16).

(15) Ni buyao blame wo! I did nothing wrong!
you may not blame me I did nothing wrong
‘Don’t blame it on me! I did nothing wrong!’

(16) Ni weishenmo bu gei wo mai roller skates!? Wode pengyou dou you!
you why not for me buy roller skates my friends all have
‘Why don’t you buy me a pair of roller skates!? My friends all have them!’

In example (15) the child switched to the English word blame instead of using its Chi¬ 
nese counterpart piping ‘criticize’. In example (16) the child substituted the English 
word roller skates for the Chinese equivalent bingxie ‘roller skates’. More significantly, 
in (15) the second sentence ‘I did nothing wrong’ is completely in English. 

Labov (1970) notes that in an interview situation when the subject or topic creates 
strong emotions in the interviewee, he or she will shift from careful speech towards 
the vernacular, the type of speech that the interviewee feels most at ease with. Simi¬ 
larly, when the children were engaged in heated arguments or quarrels, they were 
unlikely to be aware of, not to mention be selective about, their language. Hence, the 
variety of code that they used in (15)—(16) is likely to be the one that they are the most 
comfortable with. In this sense, it is clear that the children’s mother tongue, Mandarin 
Chinese, is starting to decay, since the children cannot even produce a complete sen¬ 
tence in Chinese. More significantly, the decline is not confined to the lexicon, as is 
evidenced by the second sentence in example (15). 

One more strong piece of evidence that supports the claim that the three Chinese 
children are experiencing language shift is that in any transcribed tape-recorded 
conversation among the children and their parents in the domain of home, CS by 
the children occurred much more frequently than by the adults, if the adults code- 
switched at all. Although this study focuses on a qualitative analysis, the quantitative 
evidence would also be meaningful. 

5. the social and internal factors for language shift. As Gal (1979) points
out, language shift is an instance of socially determined linguistic change, which 
involves the redistribution of communicative forms over functions in everyday inter¬ 
action. Kulick (1992) also notes that language shift is a reflection of cultural change. In 
the case of these three Chinese children, then, it is only natural that when they moved 
from an Oriental to an Occidental culture that is predominantly expressed in English, 
their primary language would change, especially when their ability to acquire another 
language is still strong. 

More specifically, in the college town under study overall social factors are conducive
to the Chinese children’s language shift. Most critically, these children attend
schools where they receive education in English. Furthermore, many of the children’s
friends, both at school and at the playground, are from countries other than China.
To communicate with these friends, the children can only use English. Although
at home these children mostly speak Chinese with their parents, this fact may not
successfully compete against the influence of the predominant use of English
elsewhere on the children’s language development. Other factors, such as sending the
children to the Chinese School, where they study Chinese over weekends, cannot
prevent the language shift from taking place, either.

Le Page and Tabouret-Keller (1985:14) view ‘linguistic behavior as a series of acts
of identity in which people reveal both their personal identity and their search for 
social roles’. In this sense, if the Chinese children attempt to create a modern identity, 
for example, an American identity, they will naturally choose to use more English. 
Wei (1994) also documents the choice of a certain language by the British Chinese 
to reveal a certain identity. Such an internal motivation, combined with the pressure 
from the external social context, constitutes a strong determinant for Mandarin Chi¬ 
nese to lose its primary language status to English in the three Chinese children. 

6. conclusion. The decay of migrant languages has been well studied. Nevertheless, 
there is a gap in terms of research on the fate of Mandarin Chinese as a migrating 
language. This study attempts to help fill this gap. Furthermore, the overall research 
on the linguistic consequences of the contact between English and Mandarin Chi¬ 
nese, the two languages with the largest number of speakers, seems to be insufficient. 
This study also attempts to contribute to this field of inquiry.⁷ Given that this
study is by nature an apparent-time study (Chambers 1995), the children’s attrition
of Chinese and shift toward English might be difficult to perceive or describe. If the
children’s CS behavior is studied longitudinally, their language shift can be more
easily appreciated.


1 I would like to express my gratitude to Rajeshwari Pandharipande and Rakesh Bhatt for
their helpful comments and guidance in the course of my preparing this paper. I am also
grateful to those who offered helpful comments and suggestions at the LACUS conference.

2 Precisely speaking, intrasentential codeswitching is referred to as code-mixing. In this paper
codeswitching is used as a cover term for both inter- and intra-sentential codeswitching.

3 Bokamba (1989) demonstrates that most proposed syntactic constraints on codeswitching
and code-mixing are cross-linguistically invalid. Even so, there do exist syntactic
constraints on codeswitching and code-mixing. The constraints proposed provide helpful
references for the study of codeswitching involving different languages.

4 The term language shift is sometimes used with two slightly different denotations. Fishman
(1964) defines it as the replacement of one language by another in certain domains, i.e., the
substitution of one language for another as the chief means of communication (Mesthrie et
al. 2000). According to some other researchers (e.g., Fasold 1984), language shift means that
a community completely gives up a socially dominated language for a dominating one. In this
paper, language shift is meant to have the interpretation as in Fishman (1964).

5 The sample questionnaire in the appendix does not represent the actual form of the
questionnaire given to informants. It is instead an adaptation for this paper, with some
of the questions (e.g., the first question) somewhat abstracted. The aim in providing this
appendix is to show what sort of data were being solicited.

6 For a detailed discussion of the definition for the matrix and the embedded language, 
see Myers-Scotton (1993). Kamwangamalu and Lee (1991) is a discussion of the matrix 
language assignment specifically associated with Chinese-English codeswitching. 

7 Hsu (1994) is a comprehensive study of the influence upon the Chinese from the English 
language. Zhou and Feng (1987) includes a good discussion of the influence of Chinese 
language and culture upon English used in China. 

APPENDIX: SAMPLE QUESTIONNAIRE 

1. In which of the following domains have you ever code-switched between Chinese and 
English? 

1) at home 

2) at school 

3) in other settings (please specify) 

2. What would you say to express the following ideas, wants, or feelings? 

1) This is a beautiful apartment. 

2) Please tell me the meaning of this idiom. 

3) There are so many interesting books here! 

4) Anyway, in-class participation will also be taken into account to decide your final 
score. 

5) Like father, like son. 

3. There are at least two ways in which the following ideas or questions can be expressed.
Which one do you prefer? (Expressions in italics are Chinese pinyin; English expressions
in the codeswitched sentences are in bold.)

1) That professor is very nice. 

a. Nage jiaoshou ren hen hao. 

b. Nage jiaoshou hen nice. 

2) Is this dish good? 

a. Zhedao cai haochi ma? 

b. Zhege dish haochi ma? 

3) Anyway, let’s eat.

a. Buguan zenmeshuo, xian chifan ba. 

b. Anyway, xian chifan ba. 

4) Are we going shopping tomorrow? 

a. Mingtian women qushangdian maidongxi ma? 

b. Mingtian women go shopping ma? 

5) Two heads are better than one. 

a. Liangren zhihui shengyiren. 

b. Two heads are better than one. 

4. Are the following expressions acceptable to you? (Expressions in italics are Chinese 
pinyin ; English expressions in the codeswitched sentences are in bold.) 

1) Ta tiantian shuo tired. 
he every day say tired 
‘Every day he says that he’s tired.’ 

2) Ba, wo sleepy le. 
dad I sleepy Particle 
‘I’m sleepy, Dad.’ 

3) Swimming dangran bucuo. 
swimming certainly good 
‘Swimming is, of course, good.’ 

4) Suoyou de performance is considered in deciding your final grade. 

all particle performance is considered in deciding your final grade 

‘All kinds of performance are considered in deciding your final grade.’ 

5) Wo bu xiangxin that he’ll be elected President. 

I not believe that he’ll be elected President 

‘I don’t believe that he’ll be elected President’. 

The phonology of the Celtic languages is marked by the disappearance of Indo-European
/p/, as in Old Irish athir ‘father’ corresponding to Latin pater. This proceeded
through stages, first with the change from /p/, possibly through /ɸ/ (McCone
1996:44), to /h/ or /χ/ (Lewis and Pedersen 1973:27), and then finally to null. Within 
Celtic, some dialects developed a new /p/ from /kʷ/, as seen in such correspondences 
as Middle Welsh pymp ‘five’ and Old Irish coic (compare Latin quinque). 

1. ogam. The earliest attested form of the ogam (or ogham) writing system consisted 
of strikes across or emanating from the vertical edges of stone monuments in the 
British Isles beginning in the fifth century. Typically, it was inscribed from bottom to 
top, leading to the traditional rendition of the signary in Table 1 (overleaf). 

The conventional Roman representation is given to the right of the sign with a 
capital letter. The broad phonetic values are largely those determined by McManus 
(1991:36-39, with one adjustment, Z, reflecting ongoing research). 

There are three central points of general agreement within the profession on the 
ogam writing system that will be of importance here. These regard the origin of the 
system, the phonetic basis of the signs, and the dialect variation of Celtic. 

The signary was quite transparently derived from some sort of tally system (compare, 
for example, Gerschel 1962 and McManus 1991:14-15), and such tally systems composed 
of strikes across a line are in evidence in the British Isles from the Upper Palaeolithic (as 
illustrated in Barham et al. 1999:80, 102). Given recent DNA evidence for the stability of 
the population from the Palaeolithic (Barham et al. 1999), the fact that this type 
of tally system appears and reappears is significant for the development of ogam, for 
it implies a consistent, persistent culture. 

One of the most strongly held tenets in the study of ogam is that the various 
columns were arranged in accordance with phonetic principles. Of course, these 
principles would not necessarily have adhered to our current view of feature or
componential phonetics; but they would have proceeded within the system along rational 
sound-related parameters. In this vein, there is also concurrence that the motivation
for setting the tally system down as a form of writing was probably influenced by 
contact with the Roman alphabet. Certainly, ogam’s appearance on stone monuments 
has been seen as an imitation of Roman grave markers. Although once somewhat 
strongly held, the idea of a direct influence by such Roman grammarians as Donatus
on the phonetic array of the system is now doubtful both from the array itself 
(McManus 1991:28) and from considerations of dating (Stevenson 1990:165). We must 
also bear in mind that while our earliest physical evidence of ogam is found on these 
stone monuments, the signs had already been incised for some undetermined length 
of time on wood, which has long since decayed. 

Table 1. The ogam alphabetic signary. 

Finally, there is growing agreement on the dialect diversity of Celtic throughout 
the entire region of Western Europe and in the British Isles themselves. The neat
progressions of /p/ to null overall and of /kʷ/ to /p/ in precise regions have been severely 
contradicted by such evidence as place names. Indeed, evidence of original /p/ surviving
relatively late in some dialects identified as Celtiberian (Rankin 1987:24) and a 
mixture of /kʷ/ and derived /p/ on the Coligny Calendar and throughout Gaul and 
Spain (see Rankin 1987:14, 23) indicate that the situation was far more fluid than we 
may have previously believed. 

2. reconstructing *p. With the ordering of the ogam signary in conjunction with 
the tally system, it should be rather clear that the phonetic value of each sign would 
have already been in use as a mnemonic device for identifying the numerical value. 
That is, a reference to ᚐ would have been made by using a word beginning with A, 
and a reference to ᚋ by using a word beginning with M, and so forth. In fact, there are 
sets of words and kennings that were used in just this manner. Such sets would have 
been quite necessary for those changing the tally system into an alphabetic signary, so 
that others would readily understand which sound was being signaled. 

While it is generally assumed that the ogam system of writing developed in Ireland, 
no stone inscription there contains the sign ᚆ or H (nor Z; Macalister 1945:v; and some 
contend that GG occurs too infrequently for precise phonetic determination; Gippert 
1990:291). The word usually associated with the sign H is the Old Irish hUath ‘hawthorn’, 
in which the initial /h/ had already been reduced to null. In all such cases as that of 
the mnemonic word, the reflex of /h/ would have been a silent grammatical entity and 
would not have been represented in the writing. Certainly as regards this sign, the
phonetic value associated with the tally mark necessarily predates Irish. 

One word associated as a kenning with H is Old Irish uath ‘fear, horror’ (compare 
also Merony 1949:28) corresponding to Latin pavere ‘to be terrified’, a point made by 
Peter Schrijver and reported by McManus (1991:37). McManus, however, dismissed 
the speculation that H might have been derived from /p/, as that would have
represented a linguistic situation much too early for the monumental inscriptions. On 
the other hand, the monumental stone inscriptions followed a tradition of unknown 
duration in which the ogam inscriptions were carved on now-perished wood. 

That H originally represented the sound /p/ is precisely what is argued here both 
from the internal structure of the ogam signary itself and from comparative evidence 
of Irish and Pictish. 

2.1. internal evidence. Let us begin with an examination of the sounds in the ogam 
signary’s array. Starting with the vowels, we note that the base value of the fourth 
column is A. The next two signs proceed quite rationally up in the back of the oral 
cavity from O to U. We then change to the front of the oral cavity and proceed in 
precisely the same manner from E to I. The idea that the first two values ‘above’ the 
base value progressed in one manner and that the next two progressed in another, but 
related manner is by no means new, having been pointed out, for example, by Carney 
(1975:54-61). Moreover, the A represents an extreme and unique position of articulation
and the only point at which we may enter the vocalic triangle in the oral cavity in 
such a way as to effect the pattern in which two values follow two values. 

The consonantal base values in the first and third columns are B and M, which 
are produced at the labial position of articulation. Parallel with the A in the vocalic 
column, these signs thus provide an articulatory base position that may be considered 
as a logical starting point to the oral cavity. This is one reason for reconstructing the 
value of ᚆ as *P: now all of the base values are phonetically consistent in a way 
that could have occurred to reasonably intelligent people at the time, as they represent 
readily identifiable positions. 

Proceeding up the columns, we find that the *P is indeed a valid fit. Very briefly, the 
consonants appear to be grouped by a perception of ‘hardness’ and ‘complexity’. The 
B column starts with the softest, most liquid pair and ends with somewhat harder 
continuants (with each pair marked or ‘complicated’ by a retraction of the tongue for 
the second member). The M column starts where the B column leaves off, with the 
nasal continuant relating to the voiced stop by complexity, which is made more complex
in the continuant off-glide; and likewise, the single affricate is complicated in the 
continuant trill. The *P column also starts where the B column leaves off with a voiced 
stop which is hardened to the homorganic voiceless stop. Within the column itself, 
the same pattern continues with a voiced stop hardened to its homorganic voiceless 
counterpart; and the next voiceless stop is complicated with the off-glide. 

Now, the soft labial B introduces the soft column, the hard labial *P introduces the 
hard column, and the complex labial M introduces the complex column, just as the base 
vowel A introduces the vowel column. This reconstructed *P thus fits the phonetic
valuation of the system quite precisely and in a highly consistent pattern; the H or null value, 
on the other hand, clearly does not. Indeed, *P provides a consistent base value for its 
column, as it serves as a basis for the progression up the column; in modern terminology, 
it shares ‘distinctive features’ both within its ‘order’ and within its ‘series’. 

2.2. comparative evidence. Not only is the signary better served with the *P than 
with the H, but the reconstruction of *P fits into comparative evidence from Pictish, a 
Brythonic Celtic dialect grouping in which /kʷ/ had already changed to /p/ (see
Forsyth 1997). We must recall, though, that the dialect situation among the Celts was not 
subject to clear geographical divisions reflecting a limited set of variations. Rather, it 
was a hodgepodge of diversity, in which one dialect might be more conservative than 
its neighbor in one respect and more innovative in another. 

While Pictish ogam is frequently problematic, one of the names that stands out 
very clearly and that is corroborated by the king lists is traditionally rendered as 
NEHT- or NEHHT-, the designation for Nechtan (variously spelled). Here the
traditional ogam sign ᚆ, H (or doubled ᚆᚆ, HH), was definitely used and represented a 
spirant /χ/ related to the aspirate /h/ both in sound and evidently in the
perception of the speakers. 

As it happens, NEHT- is cognate with the root NET- ‘grandson, nephew, descendant’ 
found in Irish ogam inscriptions. Of course, both terms call to mind Latin nepos 
‘grandson, nephew’ with its root nepot- (compare McManus 1991:100). Indeed, this 
was also used by Latinized Celts as a name; compare Cornelius Nepos of Cisalpine 
Gaul (Rankin 1987:106). (The name may possibly be related to NETTAS in Gaulish;
Evans 1967:369-70.) 

In these three forms, then, we see the historical progression noted at the beginning 
of this paper in which Indo-European /p/ (as in Latin) changed first to a spirant that 
could ultimately be realized as /h/ or /χ/ (as in Pictish) and then to null (as in Irish). 
The realization of /χ/ in Pictish is quite in accord with more general changes from 
the Indo-European /p/ before /t/ (compare Lewis and Pedersen 1973:27); and indeed, 
where the /t/ followed directly, the /χ/ was retained in Irish, as in nechta ‘granddaughter’
(compare Latin neptis). In keeping with the diverse nature of Celtic dialects,
Pictish had long retained the reflex spirant but had already changed the /kʷ/ to /p/, there 
being no longer any competing sound. 

Insofar as the rendering of the ogam signary is concerned, then, Pictish H would 
indeed be historically appropriate for the reconstructed *P. The Irish ᚆ had lost its 
pronunciation and the sound /χ/ was represented in appropriate environments as ᚉ 
or C (single or doubled), as we find, for example, in the name CARRTTACC Carthach 
(McManus 1991:124). On the other hand, Pictish ᚆ still represented /h/ or /χ/. Thus, 
the comparative evidence verifies the status of H as a reflex of *P in the Pictish ogam 
signary. This opens the door for *P in the original tally system upon which the ogam 
alphabetic signary was imposed, if not on the original alphabetic signary itself. 

3. implications. The Celtic cultures were to one degree or another marked by the 
synthesis of the pre-Indo-European cultures of the Atlantic region (those that would 
culminate in the megalithic cultures) and the cultures of the Indo-Europeans. Certainly
the most prominent of these latter peoples were those we now identify as the 
Beaker Folk. In the introduction of the Beaker Culture, however, there was a ratio of 
population impact to cultural influence that gradually tilted in favor of the latter in 
the progression westward into the British Isles, in much the same manner as had 
occurred (verified by archaeological and biological evidence) in the earlier extension
of Neolithic farming techniques from Anatolia (compare Sherratt 1997:22). For 
example, the significant changes to the grave goods included in beakers in the North 
Atlantic basin (ibid.:387) imply that the Beaker Culture was only partially adopted. 
In particular, the evidence from what was to become Pictland strongly suggests the 
adaptation of Beaker burial practices with major modifications in keeping with
traditional local practice (see Ashmore 1996). 

When we consider the Celts (particularly the Insular Celts), we are thus faced with 
an amalgam of cultures in which the conservative, pre-Indo-European elements were 
particularly prominent. One thing that the presence of arrow heads and the absence 
of javelin heads does not tell us, however, is how the indigenous languages of the 
region differed from those of the Indo-European Beaker Folk. Indeed, we could take 
the extreme position of Colin Renfrew (1987) and suggest that the Indo-European 
languages had spread during the Neolithic agricultural revolution, millennia before 
the arrival of the Beaker Folk with their somewhat related language. 

Moreover, while we can trace the progress of various strains of DNA across Europe 
in a number of different patterns and can use these patterns to help determine
movements of peoples, this evidence tells us nothing about the languages these peoples 
spoke (compare Cavalli-Sforza 2000). One of the most enlightened observations 
made by modern linguists is that there is no necessary connection between language 
groupings and genetics. 

Within this framework of uncertainty enters the problem of ogam *P. The fundamental
distinction between Celtic and the other Indo-European families is the 
absence of /p/ in Celtic. Yet, both the structural evidence of the ogam signary and 
the comparative evidence of Irish and Pictish inscriptions inexorably point to H as in 
fact deriving from an original *P /p/. Furthermore, the very nature of the tally system 
from which ogam was derived indicates an indigenous, pre-Indo-European origin. 

How, then, are we to classify ogam? In particular, just what was the language that 
provided the mnemonic phonetic values for the ogam signary? Was it pre-Indo-European;
was it pre-Celtic; or both? Or shall we say that Celtic itself, in language and/or 
in culture, stretched back much further into the past than we may feel comfortable 
with? Indeed, as recently proposed by Simon James (1999), should we be examining 
the ‘Atlantic Celts’ with their ogam signary as a separate group altogether, a group 
with many diversities of its own? Moreover, just how much of that diverse cultural 
group might we consider to be Indo-European or pre-Indo-European, however we 
may now wish to interpret that distinction? 

In any case, it is time for linguists to address the evidence of the past quarter- 
century in archaeology and in biology and to start asking such difficult questions. 
In the wake of the evidence, we need to make some major reassessments. And somewhere
within these reassessments lurks the problem of ogam *P. 

DURATIONAL PROPERTIES OF SYLLABLES AS POTENTIAL 
EVIDENCE FOR RHYTHMIC PATTERN IN L2 ACQUISITION 


Christian Guilbault 

Recently, much time and effort have been devoted to understanding the phenomenon
of foreign accent in adult L2 acquisition. As a result, it is now possible for 
researchers to predict assimilation patterns of new phonemic contrasts (Flege 1995, 
Best 1995). However, experimental studies on the nature of a possible prosodic accent 
are still too few and far between. This is somewhat paradoxical, since the effect of 
non-native prosody is considered by many to be of crucial importance for accurate 
production of individual phones, for the expression of emotions and for proper parsing
and processing of the speech signal by listeners. This study presents the results of 
two experiments which examined the rhythmic temporal pattern of French as produced
by native speakers of English learning French. 

1. the problem. Rhythm is central to the prosodic structure of speech. The work of 
Pike (1945) and Abercrombie (1967) has led to the traditional distinction between 
stress-timed and syllable-timed languages. Languages that belong to the former category
exhibit equal temporal distance between stresses and languages that belong to the latter
category exhibit equal temporal distance between syllables. A third category was later 
proposed for Japanese and Tamil, which exhibit moraic rhythm. The most important 
consequence of stress timing in languages like English, according to this proposal, is 
the necessity to either compress or lengthen syllables in order to make stress groups 
approximately equal in duration. By definition, syllable-timed languages, like French, 
exhibit constant syllable duration (except for group-final syllables, see Wenk & Wioland
1982). Measurement studies have failed to provide empirical evidence of strict 
acoustical isochrony (Dauer 1983, Roach 1982, Wenk & Wioland 1982 among others), 
and most authors now consider these labels as tendencies rather than categories. These 
studies also led to the proposal that the two opposing rhythmic types be replaced by 
a continuum with prototypical stress-timed and syllable-timed languages located at 
opposing ends. Other languages would be located on this continuum according to 
their approximate resemblance to either type (Dauer 1987). Despite the interest of 
such a proposal, the role played by temporal patterns in a language and their importance
in locating a specific language on this continuum is still unclear. 

2. goal of the current experiment. The primary goal of the current research is to 
investigate the use of duration as a primary property of the syllable and its ability 
to account for rhythmic properties of languages. More specifically, I expect to shed 
some light on the phenomenon of L2 acquisition of the temporal rhythmic structure 
of French by native speakers of English. A secondary goal of this research is to gather 
empirical evidence related to the acquisition of suprasegmental features of a language. 
The acquisition of new phonemic segmental contrasts is fairly well documented, but 
experiments investigating the acquisition of suprasegmental features are still infrequent.
In this paper, I report on two experiments which examined the rhythmic temporal
pattern of French as produced by native speakers of English learning French. 

French sentences                                    English sentences 

Luc a mangé et s’est endormi.                       Luke has been gone and has not been seen. 
Luc qui a mangé s’est endormi.                      Luke who has been gone has not been seen. 
Luc qui aurait mangé se serait endormi.             Luke who would have been gone would have been seen. 
Pierre qui a peint travaille beaucoup.              Claire who has painted travels a lot. 
Pierre qui nous a peint travaille beaucoup.         Claire who should have painted travels a lot. 
Pierre qui nous a bien peint travaille beaucoup.    Claire who should have been painting works a lot. 
Claude n’a pas vu Marie à la fenêtre.               Jim has not seen Mary at the window. 
Claude n’aurait pas vu Marie à la fenêtre.          Jim would not have seen Mary at the window. 
Claude ne l’aurait pas vu à la fenêtre.             Jimmy would not have seen Mary at the window. 

Table 1. Sample of English and French corpora. 

3. methodology. This study involved twelve participants. Among these were six 
learners of French and six native speakers of French who were recruited from the 
student and staff populations at the University of Alberta. The learners of French 
were divided into two subgroups according to their oral proficiency and their overall 
experience in French.¹ Less experienced learners (EL1) had between 7 and 13 years 
of exposure to formal instruction in high school or university. More experienced 
learners (EL2) had in general spent extensive immersion periods in French-speaking 
environments either in Montreal, Canada, or in Paris, France. All native speakers of 
English but one were native speakers of Canadian English. Native speakers of French 
were subdivided into two smaller groups, based on their dialect: three native speakers of 
Canadian French (CF) and three native speakers of European French (EF).² Both groups were 
considered representative of their respective dialect. 

Speech samples consisted of recalled single-sentence utterances. During the 
recording session, participants were presented with a single sentence on a computer 
screen. They were instructed to read the sentence and then to repeat it while facing a 
blank page. Due to the relatively high number of sentences to be uttered, no distractors
were used. The second utterance of every sentence was used for measurements. 
All speakers read the same list of sentences presented in a different random order. 

The stimuli were single sentences in French which included one relative clause 
varying in length. Given that French has a group-final primary stress, it was 
expected that this clause would have only one primary stress mark and, hence, 
would constitute a single rhythmic group.³ Six sentence groups were constructed 
involving words with different syllable structures (see Table 1 for a sample list of the 
analyzed sentences). A similar corpus was formed in English with sentences that 
have a similar number of syllables, similar syntactic structure, and, in some cases, 
comparable syllable structures. 

Syllable durations were measured within the embedded clause of each sentence. 
Measures were taken from the onset of the first segment—the nucleus or the first C 
in onset position—to the beginning of the next syllable at the zero-crossing point. 
The very rare hesitations and silent pauses were removed from the measurements. The 
segmentation process of the French corpus into syllables was done according to the 
principles proposed by Delattre (1940). Most syllables in the French corpus had a 
CV or CVC structure. The syllabification of the English corpus was also relatively 
straightforward because of the nature of the corpus. Intervocalic consonants were 
assigned to a syllable following the maximal onset principle. Most syllable structures 
in the English corpus were also CV or CVC. 


4. EXPERIMENT 1. 

4.1. variability index. The goal of this first experiment was to determine if English 
learners of French exhibit more syllabic durational variability in a production task 
than native speakers of French. In order to measure this variability, the index used 
by Deterding (2001) was chosen. This index measures variations in syllable duration 
compared to an average syllable duration. All syllables within the relative clause were 
measured. Following Deterding, it was decided not to include the final syllable of 
this clause in the measurements. The motivation for this decision is that measuring 
group-final syllables, which usually bear primary stress in French, would introduce 
a substantial amount of variation in the calculations if produced differently across 
groups of participants. Among all 30 sentences recorded for each participant, only 
the last three sentences of each sentence group had the three syllables minimally 
required for the computation of the index (total of 18 sentences for each participant). 
Normalized durations were used in the computation in order to neutralize variations 
in tempo. The formula in (1) shows the computation of the VarIndex: 


(1)   $\mathrm{VarIndex} = \dfrac{\sum_{k=1}^{n-1} \lvert d_{k+1} - d_k \rvert}{n-1}$ 

where 

$d_k$ = normalized duration (duration of a syllable divided by average 
duration of all measured durations of a phrase) of the kth syllable, 
and 

n = number of syllables 

Speaker Groups    VarIndex French Corpus    Standard Deviation 

English           0.3601                    0.1995 
EL1               0.4056*                   0.2486 
EL2               0.4298*                   0.2399 
CF                0.3508                    0.2042 
EF                0.2866*                   0.1547 

Table 2. VarIndex for all speaker groups. (*EL1 and EL2 are statistically significant 
when paired with EF.) 

According to this VarIndex, perfect syllable isochrony would allow for no durational 
variability, hence leading to a VarIndex of 0. The greater the variability in syllable 
duration, the greater the index. Digitization (20 kHz) and measurements were done 
using the Computer Speech Laboratory (model 4300) by Kay Elemetrics Corporation. 
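
As an illustration, the computation of the index can be sketched in Python. This is only a minimal sketch: it assumes that the index is the mean absolute difference between successive normalized syllable durations, which is how Deterding (2001) defines his variability index, and that the group-final syllable has already been left out. The function name `var_index` and the sample durations are hypothetical and are not data from this study.

```python
def var_index(durations):
    """Variability index for one phrase (assumed successive-difference form).

    durations: measured syllable durations in any consistent unit,
    with the group-final syllable already excluded.
    """
    n = len(durations)
    if n < 2:
        raise ValueError("at least two measured syllables are required")

    # Normalize each duration by the mean duration of the phrase,
    # which neutralizes differences in overall tempo.
    mean = sum(durations) / n
    d = [x / mean for x in durations]

    # Mean absolute difference between successive normalized durations.
    return sum(abs(d[k + 1] - d[k]) for k in range(n - 1)) / (n - 1)


if __name__ == "__main__":
    # Hypothetical durations in milliseconds.
    print(var_index([200, 200, 200, 200]))  # perfectly isochronous phrase
    print(var_index([100, 300, 200]))       # variable phrase
```

Under these assumptions an isochronous phrase yields an index of 0, and scaling all durations by a constant factor (a faster or slower tempo) leaves the index unchanged.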

Earlier studies have considered English and French as prototypical examples 
of stress-timed and syllable-timed languages. Therefore, it is expected that syllabic 
durational variability as measured with the VarIndex will be greater in English than 
in French. Previous research in L2 acquisition has found evidence which suggests that 
L2 learners gradually acquire the rhythmic properties of an L2 (Wenk 1986). Hence, 
English learners of French are expected to exhibit more variability in syllabic duration
in French than native speakers of French. Moreover, the amount of inter-syllabic 
variability should decrease as English speakers become more proficient in French. 
Finally, one more hypothesis can be made based on the differences reported between 
CF and EF (Armstrong 1999, Paradis & Deshaies 1990, Ouellet & Tardif 1996 among 
others): CF speakers will exhibit more inter-syllabic variability than EF speakers. 

4.2. results. The VarIndexes for all groups of speakers are given in Table 2 and Figure 1. 
The results partially validate the first hypothesis, as native speakers of English produced 
a greater VarIndex than native speakers of European French (EF). However, the index 
displayed by native speakers of English in their mother tongue, 0.3601, is noticeably 
lower than the ones reported by Deterding (2001). In his study, indexes of 0.448 and 
0.543 were reported for British and Singaporean speakers respectively, with standard 
deviations of 0.164 and 0.172. This discrepancy between the results of the present 
research and Deterding’s is attributed mainly to the different experimental tasks. It has 
been claimed that read speech in French (which resembles the recall speech used in the 
current experiment) is characterized by a tendency to regularize inter-syllabic intervals, 
whereas spontaneous speech favors the production of breath groups of equal length 
(Vaissière 1991). In addition, the unusually low number of syllables with complex onsets 
or codas in our corpus of English (most syllables are CV or CVC) may have contributed 
to lower inter-syllabic variability. Contrary to the first hypothesis, however, durational 
variability displayed by CF speakers is noticeably greater than the index displayed by EF 
speakers, and similar to that of English speakers. There is no immediate explanation 
for this surprising result. It is possible that the index used in this study did not capture 
the more subtle differences between the three groups of speakers, or that Canadian 
French is becoming a quantity-sensitive language (Armstrong 1999), thereby resembling
English more than European French. This hypothesis will have to be investigated
in more depth in further research. 

Figure 1. VarIndex for all speaker groups: intermediate English learners of French 
(EL1), advanced English learners of French (EL2), native speakers of Canadian French 
(CF), and native speakers of European French (EF). 

Figure 2. VarIndex for each sentence within each rhythmic group (RG3, RG4, RG5). 

The second hypothesis for this experiment is confirmed. In general, English L2 
learners of French produced greater VarIndex values and standard deviations in the target 
language than native speakers of French. Results from a two-way ANOVA revealed 
a significant Speaker Group effect (p < .05). A post-hoc (Tukey’s HSD) analysis indicated
that EL1 and EL2 are significantly different when paired with EF but not when 
paired with CF. The third hypothesis, which assigned significantly lower indexes to 
more advanced learners of French, was not confirmed by the post-hoc analysis. This 
unexpected result suggests that L2 learners of French did not make significant progress
after several years of exposure to French. It appears that despite their greater 
experience, the EL2 speakers have not acquired a native-like temporal rhythmic 
structure in French. 

A closer look at the results of the ANOVA revealed a second interesting effect. The 
analysis characterizes the main effect of Sentences nested within Rhythmic Group as 
significant (p < .05). This result indicates that some sentences were produced across 
all groups of speakers with a significantly greater index than other sentences. Among 
these sentences, RG3S1, RG4S1, and RG5S1 may have a higher index because they include 
a short vowel immediately followed by a syllable with a long French nasal vowel (Luc a 
mangé). Similarly, the presence of a complex onset like /br/ or /bj/ when preceded by a 
syllable which has neither onset nor coda, found in RG4S5 (Rome qui a brûlé serait détruite) 
and RG5S4 (Paul avait bien aimé la chasse), generated a high VarIndex. 

These results suggest that duration provides consistent information regarding 
the temporal structure associated with French rhythm. It is still undetermined if the 
absence of significant difference between both groups of learners should be attributed 
to the limits of the index or to the possibility that EL2 speakers may display fossilization.
The results also strongly suggest that specific phonemic factors must be considered
in an account of syllable variability. 

5. experiment 2: language-specific phonemic properties. The previous experiment
rested on the assumption that all speakers, including L2 learners of French, 
followed similar rules for segmenting the signal into syllables. Previous research
(Beaudoin 1996) has shown that English learners of French, especially in the earlier stages 
of acquisition, exhibit mixed syllable structures. A proper account of L2 rhythm must 
consider this important variable. The goal of this second experiment is to confirm the 
tendencies identified in experiment 1 when syllable structure is not considered. 

Measurements for this experiment are similar to those proposed by Ramus, 
Nespor & Mehler (1999), who put forward a simple measure of the duration of vocalic and 
consonantal intervals and their respective standard deviations. These intervals consist 
of the unaltered total duration of the segments in a given sentence. In their paper, the 
authors argued that this basic phonetic account of the temporal structure of a sentence 
reflects the phonemic properties of a language, such as syllable structure and vowel 
reduction, for instance. They further explain, following a proposal first made 
by Dasher and Bolinger (1982), that these language-specific phonemic properties 
are responsible for the perception of distinct rhythmic classes. 

The following experiment attempts to show that language-specific phonemic
properties are responsible for different rhythmic temporal structures. English and French
differ in a number of phonemic properties, among which the most noticeable are the
syllable structures (greater use of CVC in English, CV in French), the existence of
vowel reduction in English, and the presence of lexical stress in English (Dauer 1983).
These phonemic properties should be reflected in the phonetic structure of the
utterances, which will be measured in the vocalic and consonantal intervals.

The specific hypotheses for this experiment are: a) L2 learners of French will
produce a greater percentage of vocalic intervals in English than in French; b) L2 learners
of French will produce smaller vocalic intervals in French than native speakers; c)
standard deviations for consonants and vowels will be greater for learners of French
than for native speakers of French; and d) more proficient learners will exhibit
intervals closer to native speakers’.

5.1. methodological considerations. The data used for this analysis were taken
from the same corpus used in the first experiment. Measurements were taken only on
the last sentence of each sentence group, however. Unlike in the first experiment,
the entire sentence was used for the analysis. Once more, the last syllable of the
sentence had to be excluded from the calculations, mainly because it was on many
occasions almost inaudible and impossible to measure with accuracy.

Following Ramus, Nespor and Mehler (1999), vocalic intervals of the recalled
sentences are defined as all vowels located in between two consonants, and consonantal
intervals are formed by all consonants located between two vowels. The sentences
were segmented as in (2).

(2) Rome qui aurait brûlé serait détruite.

/r-ɔ-mk-i-ɔ-r-e-br-y-l-e-s-ə-r-e-d-e-(tr-ɥi-t)/

In this example, intervals are separated by hyphens and the omitted sentence-final
syllable is in parentheses. The authors provided three variables in their study:

1. Percentage of vocalic intervals in the entire sentence (%V), computed by
dividing the sum of all vocalic intervals by the total duration of the sentence
and multiplying it by 100;

2. Standard deviations of vocalic intervals within each sentence (ΔV); and

3. Standard deviations of consonantal intervals within each sentence (ΔC).

The sum of all consonantal and vocalic intervals should be identical to the duration 
of the entire sentence. 
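
The three variables can be illustrated with a minimal sketch. It is not the procedure actually used in this study: the vowel inventory, the function names, the invented durations (in milliseconds) and the choice of the population standard deviation are all assumptions made for the example.

```python
from statistics import pstdev

# Hypothetical vowel inventory for this sketch; a real analysis would use
# the full transcription alphabet of the corpus.
VOWELS = set("aeiouyɔəɛ")


def to_intervals(segments):
    """Merge adjacent segments of the same type into V and C intervals.

    `segments` is a list of (symbol, duration) pairs in utterance order.
    """
    intervals = []
    for symbol, duration in segments:
        kind = "V" if symbol in VOWELS else "C"
        if intervals and intervals[-1][0] == kind:
            # Same type as the previous segment: extend the current interval.
            intervals[-1] = (kind, intervals[-1][1] + duration)
        else:
            intervals.append((kind, duration))
    return intervals


def rhythm_metrics(intervals):
    """Return the vocalic percentage and the SDs of V and C interval durations."""
    vocalic = [d for kind, d in intervals if kind == "V"]
    consonantal = [d for kind, d in intervals if kind == "C"]
    total = sum(vocalic) + sum(consonantal)  # equals the sentence duration
    return {
        "percent_v": 100 * sum(vocalic) / total,
        "sd_v": pstdev(vocalic),      # population SD: an assumption
        "sd_c": pstdev(consonantal),  # population SD: an assumption
    }


# Invented durations (ms) for the first segments of example (2).
segments = [("r", 50), ("ɔ", 100), ("m", 40), ("k", 60), ("i", 80)]
intervals = to_intervals(segments)
metrics = rhythm_metrics(intervals)
```

Note that the adjacent consonants /m/ and /k/ are merged into a single consonantal interval, as in the segmentation in (2).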

5.2. results. Table 3 and Figure 3 (overleaf) present the results across all groups of
speakers. English speakers exhibit noticeably greater %V in English than in French,
thereby supporting the first hypothesis. The ratios in this second experiment are in
general greater than those reported by Ramus et al., who found proportions of vocalic
intervals for English and French of 40.1% and 43.6% respectively. These discrepancies
are, once again, attributed to the constitution of the English corpus, which is not
considered similar to free speech. As mentioned before, there is an insufficient number
of complex onsets and codas compared to CVs and CVCs, and this would give a higher
vowel interval percentage.

The analysis of the French corpus produced by all groups of speakers partially
confirmed the experimental hypotheses. Contrary to the second hypothesis, the
proportion of vocalic intervals produced by EL1 is noticeably greater than those of EL2
          # of V      # of C
          Intervals   Intervals   %V      ΔC      ΔV
EL1       94          93          57.81   0.064   0.069*
EL2       93          93          53.09   0.057   0.047
CF        95          95          53.83   0.058   0.045
EF        99          98          54.98   0.058   0.038*
total:    381         379         -       -       -
English   225         231         47.21   0.057   0.046


Table 3. Number of consonantal and vocalic intervals measured, percentage of vocalic
intervals (%V) and standard deviations (ΔC, ΔV) across all speaker groups. (*Significant
difference as determined by a Tukey (HSD) post-hoc analysis.)
Figure 3. Distribution of speaker groups over the (%V, ΔC) plane (left) and the (%V, ΔV) plane (right).


and both groups of native speakers of French. Unfortunately, the two-way ANOVA
did not reveal any significant difference between these ratios. Standard deviations for
consonants (ΔC) and vowels (ΔV), however, did confirm the experimental hypothesis,
as greater values were displayed by both groups of learners of French (EL1, EL2).
Contrary to the results reported by Ramus et al., and as predicted by hypotheses (c)
and (d), the standard deviation of vocalic intervals (ΔV) proved useful in
discriminating between speaker groups. Speakers who belong to EL1 exhibited the greatest
standard deviation (0.069), EF displayed the smallest (0.038), and EL2 and
CF presented intermediate values (0.047, 0.045). The difference between EL1 and
EF was found to be significant in a one-way ANOVA (p < .05). The differences in the
standard deviations of consonantal intervals were not significant in this study.


6. discussion. This study examined the hypothesis that duration as a fundamental
property of the syllable provides an account of English and French rhythm. It
differs from previous studies in that it compares data from adult L2 learners and
native speakers of French. The first experiment used a variability index to determine
if variations in syllabic duration are consistent with previous accounts of English and
French rhythm. Moreover, it was predicted that English L2 learners of French would
display more syllabic variability in French than native speakers. The confirmation of
these hypotheses strongly suggests that duration as a fundamental property of the
syllable must be part of a proper account of rhythm. Even though these results are in
agreement with the controversial notion of isochrony, the analysis also suggests that
a more detailed investigation is required in order to explain the effects related to the
quality of the segments and syllable structure.

An attempt was made with the second experiment to examine more accurately
the effect of phonemic properties of both languages and their role in the account
of rhythm. The tendency for learners and native speakers of French to produce
identical relative clauses with different syllabic temporal properties measured in
the first experiment was confirmed. These different acoustic properties displayed
by learners of French strongly suggest that, at least in the early stages of acquisition,
learners have not fully acquired the phonological properties of the target language. In
the acquisition of French by native speakers of English, the phonetic characteristics of
vowels seem to be more challenging than those of consonants. These experiments raise
many interesting questions regarding the nature of linguistic rhythm and its acquisition.
For instance, what exactly are the language-specific phonemic properties which
contribute to the perception of rhythm? Which of the phonemic properties are acquired
during the acquisition of an L2? Are some languages more difficult to acquire by L2
learners than others because of the complexity or nature of their rhythmic properties,
and if so, why?

The results obtained in the study presented here certainly highlight the importance
of duration as a fundamental property of French rhythm. However, more empirical
evidence from cross-linguistic studies is needed to confirm the importance of
phonemic properties of vowels in L2 acquisition as observed in this research. To confirm
this will require a larger sampling of the population and a broader sampling of languages
and speech material. Regardless, the results of this study do not support the strong
position which attributes the perception of rhythmic differences between languages
solely to isochrony. Instead, the results presented here suggest that the phenomenon
of rhythm would be better understood if analyses of languages’ segmental properties
and phonetic aspects are included.


This initial classification has been confirmed by a perceptual experiment. In this
experiment, native speakers of French subjectively classified these learners into two
distinct categories. The intraclass correlation (Shrout 1995) between all listeners was
high at 0.895.
The term ‘European French’ refers to speakers who speak a variety of French with no
traces of regional accent perceived by the main experimenter.

A rhythmic group, or groupe accentuel, in French is a series of unstressed
syllables followed by and including one syllable bearing primary stress (Lacheret-Dujour &
Beaugendre 1999:45).
ON THE SUPERIORITY OF TONETIC OVER SEGMENTAL PHONETIC
EVIDENCE: AN ORIGINAL REANALYSIS OF KHOEKHOE TONE 


Roy S. Hagman 
Trent University 


of all forms of linguistic evidence, phonetic evidence has always had the
reputation of being the ‘hardest’ and most objective. The whole scientific edifice of
structural linguistics was founded upon the corpus of phonetically transcribed
linguistic data to which the techniques of analysis could be applied. The core of the
phoneticians’ methodology was a system of alphabetic representation then already 3,000
years old, newly reinvented and universalized on a solid basis of articulatory
observation. Though, on a theoretical level, linguists gradually weaned themselves from the
alphabetic form of representation, on the practical and didactic levels it is still very
much alive and used.

Through a combination of historical accidents, however, the representation of the
non-segmental aspects of linguistic sound, and tone in particular, had been sadly
neglected from the beginning and was slow to benefit from the new articulatory
science. The languages for which alphabets were invented tended to be non-tonal, and the
Asian languages for most of their history used only ideographic systems. If it were not
for the efforts of Byzantine Greeks attempting to represent the pitch-accent of ancient
Greek with marks above the vowels, even accent marks would likely never have been
invented. These, however, gradually lost their association with pitch properties, an
association which had to be rediscovered by modern scholars (Stanford 1967).

Because available phonetic alphabets had no reliable way to indicate tone, early
modern investigators of tone languages had to improvise, inventing their own
idiosyncratic methods for observing and recording the tones of the languages they were
studying. Their efforts ranged from impressionistic drawings of tonal contours to the
detailed mapping of the pitch of the voice against a grid of frequencies or musical
pitches. The articulatory science developed during the 19th and early 20th centuries
was of little help to them for the simple reason that the glottal tightening and
thickening responsible for changes in the fundamental pitch of the voice were not for them
observable phenomena. They were, in effect, forced to do an early, non-instrumental
form of acoustic phonetics.

When properly done, tonetic transcription had the distinction among all forms of
phonetic transcription of having been done by means of an entirely extra-linguistic
form of measurement and so was not subject to the limits imposed by an available
inventory of phonetic symbols. A transcriber with a good ear for pitch could record
the continuous rise and fall of the fundamental pitch of the voice with reliable
accuracy. When the pitch meter first made its appearance at mid-century, the tracing
1.  high-rising        c' > d'
1a. less high-rising   a# > c'
2.  mid-rising         g  > b
3.  low-rising         f  > f#
4.  high-falling       c' > b
5.  mid-falling        a# > g#
6.  low-mid level      g# > g#


Table 1. Khoekhoe tonal contours.

of pitch became mechanized, but the complex nature of vocal tone made it only a
moderately reliable device. The most modern phonetic software now makes it
possible to do entirely reliable tracings of vocal pitch by following the entire overtone
series of the vocal buzz and using this information to compute fundamental pitch.
The pitch tracings that such modern devices produce are, however, very similar to
what can be done with the human ear, whose neural machinery no doubt computes
fundamental pitch in very much the same way.

An example of good early tonetic recording and its advantages that I would like to
examine is the work of D.M. Beach, who in the 1930s published his classic work The
Phonetics of the Hottentot Language (Beach 1938), which included a careful tonemic
analysis of what we now refer to as the ‘Khoekhoe’ language, a major indigenous
language of Namibia best known for its extensive inventory of click consonants. Beach’s
work on Khoekhoe tonetics was inspired by Karlgren’s work on Chinese and included
careful tracings of continuous pitch fluctuations against a grid defined by the tones of
the western musical scale. He was, of course, measuring absolute pitch, not the relative
pitch that is the essence of tone languages, and he chose to simplify things by
limiting his tracings to the pitch range of one person, his primary linguistic informant.
The utterances he recorded ranged all the way from individual roots to entire folk
tales, where overall intonation contours could be seen to carry the tonal fluctuations
of the individual roots over an extensive absolute pitch range. Beach actually recorded
and published much more tonetic material than he himself could analyze,
recognizing that the interplay of tone perturbations, intonation, and syntactic factors was
producing complexities that were beyond his powers of analysis with the primitive
phonemic theory then available to him. He admittedly included much of this extra
information for the benefit of future generations of linguists to unravel with
techniques yet to be invented and theories yet to be proposed. These complexities have
finally been done justice in the recently published work of Haacke (1999).

Beach’s examination of his tonetic data led him to propose that the tonemic system 
of Khoekhoe consisted of six contour tones which he mapped according to the tones of 
the musical scale. Presenting Beach’s contours according to the names of the note values 
at the end of the contours, we get the patterns in Table 1. 

Note that the contours all fell within the following sequence of pitches:
f-f#-g-g#-a-a#-b-c'-c#'-d', comprising the comfortable speaking-pitch range of his primary
informant. The domain of these contours was not the syllable, as was the case with
other known tone languages, but the root. The root in Khoekhoe was quite a well
defined unit phonologically, and roots fell into five classes according to canonical
form: C, CV, CVV, CVN, and CVCV. The classes were numbered sequentially I
through V, and only II through V could carry tonal contours.

Kenneth Pike in his book Tone Languages (Pike 1948) was the first to express
scepticism regarding Beach’s assigning of contours to roots rather than syllables, and he
suspected that there was an alternate solution. The other logical possibility was that the
contours were the result of the sequencing of two register tones, but the fact that Beach
assigned them to roots of class II with the form CV meant that two register tones would
have to be regularly found on single vowels: not impossible, but unlikely.

As a result of my own analysis of the language (Hagman 1977), I found that many 
of the roots which Beach had classified in Class II (CV) were actually roots of Class 
III (CVV) where the two vowels happened to be identical. Beach had not thought of 
this because he had unfortunately rejected the length marks that had been placed on 
just these roots by earlier investigators, since he had tried his best to measure absolute 
vowel length and found he could not do so reliably. Once these roots are removed 
from Class II (CV), all that remains in this class are particles and suffixes, so that all 
true roots with tone in Khoekhoe then only have one of three forms, CVV, CVN, and 
CVCV, all three with two nuclei, each of which can carry a register tone. 

However, even knowing that all the roots are bipartite, it is still hard to see how
there could be register tones when one looks at Beach’s tonetic tracings of the six
contours. Moreover, if there were two registers there should be four contours, not six, and
if there were three registers, there should be nine. How the contours might be
resolved into register tone sequences is thus not immediately apparent.

Part of the solution is a reanalysis that becomes possible upon a close study of
Beach’s methodology, which he scrupulously describes at every step. In his later work
on the Korana language, a close relative of Khoekhoe, Beach describes a new
elicitation technique where he devised frames to include the word whose tone is being
examined. His intent was to prevent the distortion caused by utterance-final contours
(Beach 1938:243). It turns out that Khoekhoe, like Korana, has an intonation pattern
where the pitch of the voice falls at the end of an utterance, a contour which
automatically applies when words are uttered in isolation. The possibility then arises that
perhaps register levels are obscured in the Khoekhoe tracings because the contours are
distorted by this effect.

When one goes back to Beach’s contours and raises each of the final pitches in the 
contours which do not already end in a lower tone, there appears a suggestion of three 
pitch levels between which the contours are moving: a low level on the note f, a mid 
level on the note g#, and a high level on the note c'. One need only extend some of the 
contours at each end to get the underlying tonal contours in Table 2 (overleaf). 

Tone number 1, high rising, is an allotone conditioned by a diphthong ending in
a high vowel, so we will take 1a, less high rising, as basic. After making all the
necessary alterations, it becomes clearly apparent that the contours are moving from one
1.  high-rising        c' > d'
1a. less high-rising   a# > c'    becomes   g# > c'
2.  mid-rising         g  > b     becomes   f  > c'
3.  low-rising         f  > f#    becomes   f  > g#
4.  high-falling       c' > b     becomes   c' > c'
5.  mid-falling        a# > g#    becomes   c' > g#
6.  low-mid level      g# > g#    becomes   g# > g#


Table 2. Derivation of underlying tonal contours.

of three pitch levels in the first syllable to one of two in the second, resulting in the
six combinations found by Beach. We will return later to the question of why the
contours may appear to be shortened in Beach’s data.
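
The arithmetic behind this claim can be checked with a short sketch. The note-to-register mapping follows the three levels named above (f, g#, c'); the underlying contours are entered as I read them from Table 2, and the variable and function names are my own, introduced only for illustration.

```python
from itertools import product

# Pitch sequence of Beach's informant, in ascending semitone steps.
PITCHES = ["f", "f#", "g", "g#", "a", "a#", "b", "c'", "c#'", "d'"]

# Register levels proposed in the reanalysis: low, mid, high.
REGISTER = {"f": "L", "g#": "M", "c'": "H"}

# Underlying contours (start note, end note), as read from Table 2.
UNDERLYING = {
    "1a. less high-rising": ("g#", "c'"),
    "2. mid-rising": ("f", "c'"),
    "3. low-rising": ("f", "g#"),
    "4. high-falling": ("c'", "c'"),
    "5. mid-falling": ("c'", "g#"),
    "6. low-mid level": ("g#", "g#"),
}


def semitones(start, end):
    """Distance in semitones between two notes of the informant's range."""
    return PITCHES.index(end) - PITCHES.index(start)


def register_pairs(contours):
    """Translate each contour into a (first syllable, second syllable) register pair."""
    return {name: (REGISTER[a], REGISTER[b]) for name, (a, b) in contours.items()}


pairs = register_pairs(UNDERLYING)
# Three registers on the first syllable, two (mid, high) on the second.
expected = set(product("LMH", "MH"))
```

Under this reading, the six underlying contours exhaust the 3 x 2 combinations exactly, with no gaps and no duplicates.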

The presence of different numbers of registers in each syllable is still, however, a
problem. Languages with register tones tend to have the same number of registers
for all syllables. Interestingly, Beach himself solves this problem for us with the
historical-comparative work he did using Khoekhoe and its sister language Korana
(Beach 1938:247-53). Words beginning with the lowest tone in Khoekhoe tend to
correspond with words beginning with voiced consonants in Korana, distinctive voicing
not being present in Khoekhoe. He suggests that the lowest tone in Khoekhoe is the
result of a relatively recent loss of voicing in initial consonants. Since medial consonants
are always voiced in both languages, the same did not happen in the second syllable.
Reanalysing Beach’s contours into register sequences, we can say that the lowest tone of
the first syllable, an extra-low tone, was likely a replacement for a lost voicing
distinction. In acoustic terms, we would be looking at the replacement of the downward
bending of the first formant produced by a preceding voiced consonant with a lowering of
fundamental vocal pitch, which would have a roughly similar perceptual effect.

Beach never did a tonemic analysis of Khoekhoe’s particles and suffixes, but he knew
they didn’t have contour tones. In my own work, I found three register tones to be
present in these, which is not surprising since Korana has a voicing distinction in the
consonants of these items too, which Khoekhoe has also apparently lost. It turns out that it
is only the second syllable of a root where just two of the three possible register tones
are found. It is thus now very simple to transcribe the language with markings for three
tonal levels: acute accent for high, nothing for mid, and grave accent for low.

Even if one accepts that the evidence accumulated since Beach’s time weighs
heavily in favour of a register-tone interpretation of Khoekhoe tonology, one cannot help
but remember that Beach diagrammed his contours as smooth curves; nowhere do we
see the jumping between registers from syllable to syllable which we expect to find in
a register tone language. However, when we read Beach closely, we begin to realize that
his tracings and generalizations about contours apply primarily to the roots of Classes
II through IV (CV, CVV, and CVN), and much less so to the less numerous roots of
Class V (CVCV). These roots, which he always discusses last, are the only roots with
clearly disyllabic structure. When we closely examine his tracings of longer texts we see
movements between more clearly defined pitch levels and not the smooth contours of
the other roots. It now looks like the contouring effect found on most Khoekhoe roots
may be explainable as a sort of tonetic ‘diphthongization’, analogous to the purely
phonetic form of diphthongization which occurs in Khoekhoe roots with vowel clusters, as
I described in an earlier paper (Hagman 1995). A concomitant effect is the shortening
of tonal glides in these roots as a side effect of a rapid articulation, an effect which we
reversed earlier in our reanalysis when we extended the glides.

In conclusion, the preceding reanalysis was only possible because of Beach’s 
extremely careful tonetic transcription and his scrupulous detailing of methodology. 
The transcriptions are observations of physical acoustic facts, not categorizations 
according to a given inventory of symbols, nor extrapolations of a preexisting theory 
of tone, nor impositions of any traditional view of language. We can be sure of this 
because at that time there were no symbols for tones, there was no general theory of 
tone, and there was no traditional view on the subject of tone to impose. By virtue 
of working in an area that linguistics had neglected, Beach had found himself doing 
some investigative work which can stand up to anyone’s definition of good science. 
ONOMATOPOEIA MARKERS IN JAPANESE


Ken-Ichi Kadooka 
Ryukoku University, Kyoto, Japan 


onomatopoeia in the lexicon of Japanese is both abundant and systematic. Japanese 
has a far richer and more systematic onomatopoeic stock than either Chinese or 
English, though not as great as Korean. The quantity is shown in the 1634 examples in 
Kakehi et al. (1996). I use the entries in this dictionary as a database for the following 
discussion. 

An inspection of the database reveals five morphological classes of onomatopoeia, 
as in (1): 


(1) bare stem                26   hisi        (to hug firmly)
    altered reduplication    41   gasa-goso   (to look for something)
    doubled base             45   butu-kusa   (voice/manner of grumbling)
    reduplication           716   koro-koro   (something rolling)
    others                  806   guiQ        (< base gu, jerking action)


Only 26 entries, or 1.59% of the total, are bare stems. This suggests that actual
onomatopoetic usages are realized as a base with some morphological suffix(es). For
example, altered reduplication includes a change of phoneme(s); in some examples,
such as dota-bata ‘(state of being extremely busy with something)’, the whole first
syllable is different. In gasa-goso, the /a/ vowels in the base are replaced by /o/ in the
reduplicant. No example of consonant alternation alone is found in the database. Doubled
bases, on the other hand, are fairly common: e.g. suQteN-korori ‘manner of falling down
while walking or running’ < suQteN ‘stumbling’ and korori ‘rolling’.

Reduplication is the most characteristic onomatopoetic morphological class. 
‘Others’ include various forms, amounting to 806 items in the database. The major 
subdivisions are shown in section 3. 
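
As a quick check on the proportions cited above, the class counts from (1) can be converted into percentages of the 1634 entries. This is a minimal sketch; the dictionary and function names are my own.

```python
# Entry counts per morphological class, as listed in (1).
CLASS_COUNTS = {
    "bare stem": 26,
    "altered reduplication": 41,
    "doubled base": 45,
    "reduplication": 716,
    "others": 806,
}


def class_shares(counts):
    """Return each class's share of the database as a percentage (2 decimals)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


total_entries = sum(CLASS_COUNTS.values())
shares = class_shares(CLASS_COUNTS)
```

The counts sum to the 1634 entries of the dictionary, and the share of bare stems comes out at the 1.59% reported in the text.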

In the following sections, we show that entries in the Japanese onomatopoeia lexicon
are characterized by morphological or phonological markers. We call those morphemes/
phonemes/features which are peculiar to Japanese onomatopoeia Onomatopoeia
Markers (hereafter OMs), and describe the sound symbolic system of Japanese
onomatopoeia in more general terms in section 3.4.

1. systematicity of Japanese onomatopoeia. In this section, we briefly review the
organization of Japanese onomatopoetic lexical items. We assume that the
systematicity originates in the derivational structure of the lexeme (base + suffix[es]) and that
this derivation is synchronic rather than historical. The most representative
onomatopoetic base is disyllabic, with the canonical form in (2).

(2) /C₁V₁C₂V₂/

Typically the first consonant is voiceless. These base forms are not themselves
onomatopoetic without OMs, but they can be reduplicated to become attested forms.

The concept of OMs was suggested in Waida (1984). These are affixed to the base,
forming actual onomatopoetic lexemes, as in (3).

(3) base + OM = onomatopoetic lexeme

OMs are usually suffixed to the base, symbolizing various aspects of the action or
sound depicted by the base. Because bare stems are rare in the database, we can
conclude that OMs are generally necessary for the onomatopoeia lexicon of Japanese.

OMs can be divided into two categories, moraic and segmental. Moraic OMs are
more conspicuous because they add at least one mora to the base. The labels are
tentative, and segmental OMs have never been studied systematically.

The five types of moraic OMs are given in (4):

(4) moraic nasal N
    moraic consonant Q: duplication of the following consonant
                        [p t k s h] or a glottal stop
    vowel prolongation R
    ri
    Reduplication

These OMs differ from one another with regard to phonological duration or length;
N, Q, and R add only one mora to the base and do not constitute independent
syllables; conversely ri adds one syllable to the base and Reduplication multiplies the
number of syllables in the base.

Segmental OMs modify a distinctive feature of some phoneme of the base. The three
resulting modifications of this category are less visible morphologically than the
modifications of moraic OMs. The classes of segmental OMs are summarized in (5).

(5) Voicing          433   /koro/ → /goro/     (rolling)
    Palatalization    56   /kata/ → /katya/    (clitter-clatter)
    Spirantization    37   /kutya/ → /kusya/   (messy)

Moreover, there may be more than one segmental OM. For example, voicing may be
affixed to the spirantized product of the original base: /kutya/ → /kusya/ → /gusya/
‘messy’. It is also possible to have three segmental OMs with Voicing, then
Palatalization, then Spirantization: /heta/ → /peta/ → /petya/ → /pesya/ ‘flat’. I discuss these
OMs, including their relative order and the reason for calling h → p voicing, in
section 3.
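
The stacking of segmental OMs can be sketched as the composition of three string rewrites. The rules below are generalized from the single examples given in (5) and are assumptions made for illustration only: voicing is applied to the initial obstruent, palatalization inserts y after the medial consonant of a CVCV base, and spirantization rewrites ty as sy. The h → p step is left out.

```python
# Voiceless obstruents and their voiced counterparts.
VOICED = {"p": "b", "t": "d", "k": "g", "s": "z"}


def voice(base):
    """Voice the initial obstruent of the base: koro -> goro."""
    head = base[0]
    return VOICED.get(head, head) + base[1:]


def palatalize(base):
    """Insert y after the medial consonant of a CVCV base: kata -> katya.

    Assumes the base is exactly C1 V1 C2 V2 in romanized form.
    """
    return base[:3] + "y" + base[3:]


def spirantize(base):
    """Replace the first ty sequence with sy: kutya -> kusya."""
    return base.replace("ty", "sy", 1)


# Two segmental OMs stacked, as in /kutya/ -> /kusya/ -> /gusya/.
stacked = voice(spirantize("kutya"))
```

Writing each OM as a separate function makes the relative order of application explicit, which is the point at issue in the chained examples above.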

2. moraic oms. In this section, we survey moraic OMs without attempting a detailed
study of individual OMs. Each OM is attested in previous studies (e.g. Waida 1984,
Kadooka 1993), hence detailed investigation is omitted here. We mention only that
there are hundreds of examples of each of the five OMs.

Some of the forms on the base /koro/ are given in (6). Each depicts a manner of
rolling.

(6) *koro       (the base does not occur alone)
    koroQ       something rolls only once; the rolling action does not
                continue for a long time
    koro-koro   the thing rolls more than once
    koroN       implies the completion of the rolling
    koroRQ      the rolling object is large, suggested by the prolongation of the
                vowel R
    korori      the thing rolls and then stops

The notion of synchronic derivation is exemplified by the set of lexemes which have 
koro as a base. They all communicate something to do with rolling, but the manner of 
rolling differs from one to another, depending on the OM. To give a concrete example, 
a moraic stop Q signifies a short period of time, whereas the long vowel produced by 
R suggests either that the activity is extended in time or that the object is large. 
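
The derivation base + OM = lexeme can be made concrete with a minimal sketch that builds the forms in (6) from the base koro. The function names are hypothetical, and the sketch does not model co-occurrence restrictions between markers.

```python
MORAIC_SUFFIXES = {"N", "Q", "R", "ri"}


def add_marker(form, marker):
    """Apply one moraic OM to a form."""
    if marker == "Reduplication":
        # Reduplication copies the whole form built so far.
        return f"{form}-{form}"
    if marker in MORAIC_SUFFIXES:
        return form + marker
    raise ValueError(f"unknown marker: {marker}")


def derive(base, markers):
    """Apply a sequence of moraic OMs to a base, left to right."""
    form = base
    for marker in markers:
        form = add_marker(form, marker)
    return form


# The attested forms in (6), derived from the base koro.
forms = [derive("koro", m) for m in (["Q"], ["Reduplication"], ["N"], ["R", "Q"], ["ri"])]
```

With the markers ri, N and Reduplication applied in that order, the same function yields kororiN-kororiN, the nonce formation discussed later in this section.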

The non-occurrence of the bare stem *koro is typical of the lack of productivity of 
onomatopoetic roots without OMs. However, this stem is understood by native speakers 
as referring to the action of rolling. The form is regarded as unfinished without OMs. 

The form koroQ is often followed by the citation particle to; hence the moraic
consonant is realized as [t]. As mentioned above, when only this marker follows the base,
a short action is suggested.

In contrast to the short action expressed by koroQ, the reduplication koro-koro
communicates rolling repeated more than once. The base can be reduplicated more than
once, resulting in a three- or four-fold form: koro-koro-koro or koro-koro-koro-koro.
Such improvisational repetitions are not listed in dictionaries. They remind us of the
vivid narrative style used in telling nursery tales to children. Naturally the length of
the rolling action is proportionate to the number of reduplications. Reduplication can be
combined fairly freely with other OMs.

The moraic nasal N contrasts with Q in communicating extended duration or 
large size. A larger object is pictured with N than with Q. At the same time, the 
completion of the action is also implied with N. At least part of the communicative 
contrast between N and Q can be ascribed to the phonetics. The greater sonority of N 
relative to Q reflects the greater vs. lesser duration or size iconically. 
Two OMs, R followed by Q, appear in koroRQ. Of the two markers, the vowel
prolongation R is predominant in conveying the manner of rolling, in that it hints at a
longer duration as well as the slowness of rolling. The second marker Q is phonetic
rather than symbolic in this case; a form without Q, i.e. *koroR-to, is unstable.

The OM ri is an individual syllable. Its origin may be related to the Middle
Japanese auxiliary keri. No etymological commitment is made here. Ri communicates the
termination of the action. In korori, something rolled and stopped.

These five forms do not exhaust the derivations from the base koro. Though there 
are cooccurrence restrictions on some combinations of OMs (Q followed by R is 
phonetically impossible), moraic OMs can be combined rather freely, sometimes 
producing nonce formations. For example, kororiN-kororiN ‘(manner of something 
rolling intermittently)’ combines ri, N, and reduplication. This kind of productivity is 
unique to Japanese onomatopoeia. 
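
The way moraic OMs and reduplication combine with a base can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `derive` and its parameters are hypothetical, the markers are written in the romanization used here, and the only co-occurrence restriction encoded is the one mentioned above (Q followed by R); other restrictions are not modeled.

```python
# Hypothetical sketch: attach moraic OMs to a base, then reduplicate the result.
# Only the restriction named in the text (Q followed by R) is checked.

def derive(base, markers=(), copies=1):
    """Return base + markers, repeated `copies` times and joined by hyphens."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    for first, second in zip(markers, markers[1:]):
        if (first, second) == ("Q", "R"):
            raise ValueError("Q followed by R is excluded")
    stem = base + "".join(markers)
    return "-".join([stem] * copies)


print(derive("koro", ["Q"]))                 # koroQ
print(derive("koro", copies=2))              # koro-koro
print(derive("koro", ["R", "Q"]))            # koroRQ
print(derive("koro", ["ri", "N"], copies=2)) # kororiN-kororiN
```

Treating reduplication as a separate `copies` argument, rather than as one more marker, mirrors the observation that the base can be repeated three or four times in improvised forms.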

3. segmental oms. In this section, segmental OMs are discussed. These OMs differ 
from moraic OMs both in how they are applied and how the bases appear with them. 
These OMs are phonological in nature: some segmental feature of the base is changed, and this
shift communicates a change in the status of the thing, action, or sound under description.

3.1. voicing. We begin with Voicing. The voiceless obstruents of the bases /p t k s/ 
are voiced, producing /b d g z/, respectively. We return to the triad /h p b/ below and 
explain why the laryngeal fricative /h/ is paired with the labial stops /p b/. Take the 
base /koro/. The first stop of the base /k/ is voiced to /g/, and the converted base /goro/
implies that larger things are rolling than in /koro/. 

Voiced obstruents are generally regarded as marked, relative to their unvoiced 
counterparts. Beside the iconicity between semantic and phonological markedness, 
however, there are other reasons for considering the voiced base marked. Take /koro/ 
again. This base portrays the manner of things rolling, ranging from smaller to rela¬ 
tively large things such as apples and oranges. The voiced counterpart /goro/, on the 
other hand, can only refer to large things. In this sense, the bases with voiced obstru¬ 
ents are marked, relative to semantically related bases with voiceless counterparts to 
those obstruents. Voiceless obstruents in Japanese are also considered unmarked for
reasons other than those adduced in our examination of onomatopoeia. One of these is
rendaku ‘sequential voicing’ (cf. Vance 1987, chapter 10). More generally, in a
voiceless-voiced contrast the voiceless member would be judged unmarked not only in
Japanese onomatopoeia but also in the rest of the lexicon in many of the languages of
the world.

Voicing occurs in 433 instances in the database (cf. [5] in section 1). This nearly
equals the number of occurrences of moraic OMs. If we include the triad /h p b/ in
this group, about 900 entries are involved in Voicing OMs, i.e. the 433 counted once in
their unvoiced form and once again in their voiced form. This accumulation of data
suggests the legitimacy of Voicing OMs.

It is generally the first consonant of a /C₁V₁C₂V₂/ base that is affected by voicing. But in
a few entries, e.g. /doteN/ → /dodeN/ ‘manner of falling down’, the second consonant
is voiced. C₁ is, of course, in a more prominent position and a shift in voicing is more
obvious there. In this regard, C₂ plays only a subsidiary role. But in the case of /doteN/,
C₁ is already voiced, so Voicing can only affect C₂.

There is also another type of exception to limiting Voicing to C₁: Double Voicing.
In some lexemes, both voiceless C₁ and C₂ are voiced. Consider the base tyapo ‘splashing
sound’. Moraic N is suffixed to /tyapo/, giving /tyapoN/. This form can be double-voiced,
giving /dyaboN/. Double Voicing is presumably an Obligatory Contour Principle
(OCP) violation. The OCP prohibits two or more segments sharing a single feature
within a given phonological domain. William J. Sullivan (personal communication)
points out that the OCP is an artifact of a two-dimensional theory of phonology that
does not permit a full exploitation of hierarchical (non-linear) relations. There is no
question that /dyabo/ and forms like it occur naturally, so the implications for the OCP
need further consideration.
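
As a rough illustration of how Voicing applies to romanized bases, the following Python sketch voices either the first voiceless obstruent or all of them. The function names `voice_first` and `voice_all` are hypothetical, and the mapping covers only the pairs /p t k s/ → /b d g z/ named in 3.1; the triad /h p b/ is deliberately left out.

```python
# Hypothetical sketch of the Voicing OM over romanized bases.
# The mapping follows the pairs named in the text: /p t k s/ -> /b d g z/.
VOICED = {"p": "b", "t": "d", "k": "g", "s": "z"}


def voice_first(base):
    """Voice the first voiceless obstruent; a base without one is returned unchanged."""
    for i, seg in enumerate(base):
        if seg in VOICED:
            return base[:i] + VOICED[seg] + base[i + 1:]
    return base


def voice_all(base):
    """Voice every voiceless obstruent (Double Voicing in a two-consonant base)."""
    return "".join(VOICED.get(seg, seg) for seg in base)


print(voice_first("koro"))   # goro
print(voice_first("doteN"))  # dodeN
print(voice_all("tyapoN"))   # dyaboN
```

Because /d/ is not a key of the mapping, `voice_first("doteN")` skips the already voiced first consonant and lands on the second one, which is the situation described for /doteN/ → /dodeN/.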

Now consider the triad /h/-/p/-/b/. Articulatorily, /h/ is a laryngeal fricative and
/p/ and /b/ are bilabial stops, hence there is a more than one-feature gap between /h/
and /p b/. Yet the three consonants are treated as related in modern Japanese orthography,
reflecting the historical change /p/ → /ɸ/ → /h/, such that /h/ is unmarked,
while /b/ is marked and /p/ most marked. Orthographic markedness is signaled
in the kana syllabary: syllables with a /b/ onset are indicated with two superscript
dots and those with a /p/ onset with a superscript circle. The dots are also used for the
other voiced obstruents /d g z/, but the circle is used only for /p/ onsets. This also suggests
a markedness order of /h/ > /b/ > /p/.

But a different markedness order, /h/ > /p/ > /b/, is given by the onomatopoeia
series in (7).

(7) /hata-hata/: the sound of something like a flag fluttering in the wind
    /pata-pata/: the wind is stronger than in /hata-hata/
    /bata-bata/: the wind is stronger than in /pata-pata/

Dozens of the entries with the onset triad /h p b/ can be found in both sound 
and manner mimesis, proving the appropriateness of the derivational relationship 
among the consonants. 

3.2. palatalization. The second segmental OM is Palatalization, which converts each
of the consonants /p b t d k g s z h n/ into a palatalized counterpart. It is impossible to
find a common articulatory feature defining these sounds as a natural class. Worse, the
[coronal] /r/ and the [labial] /m/ are excluded from the list. At the moment it seems
impossible to give this set of consonants a neat phonetic description. Palatalization
applies to 56 items, versus 433 for Voicing. We return to this gap in section 3.4.

Palatalization occurs before the back vowels /u o a/, since consonants are automatically
palatalized before the front vowels /i e/. Under palatalization, for example,
/beto-beto/ ‘sticky’ becomes /betyo-betyo/ [beco-beco] with palatalization of the /t/.
Beto-beto communicates the idea of stickiness, such as with oil on the hand. Betyo-betyo
adds a touch of discomfort. Palatalization does not necessarily entail a bad connotation.
For example, /suru/ → /syuru/, both mimicking the rustling sound of cloth,
implies nothing wrong.

Needless to say, the plain root is unmarked and the palatalized counterpart is 
marked. This parallels the relative markedness of plain to palatalized consonants. Pal¬ 
atalization is marked relative to voicing, because it has a broader area of application. 
That is, both unvoiced and voiced consonants can be palatalized, as can consonants 
like nasals, for which voice is not phonemic. In the onomatopoeia base IC^V^C^VJ, C, 
is more likely to be palatalized, but this is far from general. Many of the palatalized 
C s communicate a splashing or dripping sound, but other possibilities are included 
in (8). 


(8) /kata/   →  /katya/   clitter-clatter
    /nu/     →  /nyu/     to appear suddenly
    /pota/   →  /potya/   dripping sound of water
    /tapo/   →  /tyapo/   splashing sound of water
    /zyori/  →  /zori/    roughness of the surface of human skin


The entries in (8) are typical of palatalized items. But it is difficult to give a unified
semantic description or explanation for all of them. At best, something like ‘roughness’
or ‘discontinuity’ is the most general semantic feature for these palatalized entries.
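
The conditions on Palatalization stated in this section (a target consonant from /p b t d k g s z h n/, followed by a back vowel) can be sketched as follows. The function name `palatalize` and its 1-based `position` argument are hypothetical, and the sketch works on romanized bases only; it makes no claim about which consonant position is preferred.

```python
# Hypothetical sketch of the Palatalization OM over romanized bases.
VOWELS = set("aiueo")
BACK_VOWELS = set("auo")
PALATALIZABLE = set("pbtdkgszhn")  # the ten target consonants listed in the text


def palatalize(base, position):
    """Insert /y/ after the consonant at the given 1-based consonant position."""
    if position < 1:
        raise ValueError("position is 1-based")
    # Lowercase non-vowel letters other than y count as consonants;
    # uppercase N, Q, R are moraic markers and are skipped.
    consonants = [i for i, seg in enumerate(base)
                  if seg.islower() and seg not in VOWELS and seg != "y"]
    if position > len(consonants):
        raise ValueError("base has no consonant at that position")
    i = consonants[position - 1]
    if base[i] not in PALATALIZABLE:
        raise ValueError("consonant is not a Palatalization target")
    if i + 1 >= len(base) or base[i + 1] not in BACK_VOWELS:
        raise ValueError("Palatalization applies only before /a u o/")
    return base[:i + 1] + "y" + base[i + 1:]


print(palatalize("kata", 2))  # katya
print(palatalize("nu", 1))    # nyu
print(palatalize("tapo", 1))  # tyapo
```

Rejecting /r/ and /m/ and any consonant before /i e/ with an error, rather than returning the base unchanged, keeps the excluded cases visible when the sketch is tried on other bases.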

In contrast to Double Voicing, double palatalization is impossible. Schourup and
Tamori (1992:123) suggest that this is due to articulatory difficulty: if both C₁ and C₂
are palatalized, the result is a tongue twister for a native speaker of Japanese, though there
is no such restriction in Slavic languages. This is one more piece of evidence
that voicing is articulatorily easier than palatalization in Japanese onomatopoeia
phonology.

There is one interesting pair of bases belonging to the group of dripping sounds of
water: /pota/ and /tapo/. The latter is interchangeable with /tapu/ by alternation of the
second vowel. Thus /tapo/ and /tapu/ may both be distinct entries, giving the same
meaning with two different syllables. This suggests that /p/ and /t/ are equivalent in
symbolizing the dripping sound. If C₁ is predominant in the Japanese onomatopoeia
lexicon, this alternation is a rare case.

3.3. spirantization. The last alternation is spirantization or sibilantization. Here
a stop /t/ becomes a sibilant /s/, often in palatalized syllables. The stop /t/ is regarded
as phonologically unmarked because the plosive phonemes (/p t k b d g/) outnumber
the fricatives (/s z h/) in the Japanese consonant system. Hence it is more
natural to postulate that the sibilant /s/ derives from /t/ rather than the opposite.

The next question is why Spirantization applies only to the voiceless stop /t/ and not
its voiced counterpart /d/. There are phonological reasons for this. First, Spirantization
generally follows palatalization. The contrast between the voiceless /sy/ and /ty/ is maintained
and can be neutralized in spirantization. But the contrast between the voiced
/dy/ and /zy/ is already neutralized. There is nothing left for spirantization to neutralize.

Spirantization is the most marked of the three segmental OMs. It has even fewer
occurrences than Palatalization: only 37 examples are attested in the database.

Regarding combinations of segmental OMs, there is an interesting
example concerned with the sound of splashing: the palatalized C₂ of /patya/ is
spirantized to /pasya/; each of these can get a Voicing OM on C₁, giving /batya/ and
/basya/, respectively.

Spirantization is restricted almost exclusively to C₂ of the onomatopoeia root. The
sole exception is /syaN/ (see below). Its range of application is limited to /t/ in C₂ position.
This tells us why spirantization is so rare.

As mentioned above, the only case of the Spirantization OM not applied to C₂ is
the monosyllabic base /syaN/, derived from /tyaN/ ‘neatly, consciously’. There are no
instances of Spirantization on C₁ of a /C₁V₁C₂V₂/ base.

Among the 37 entries with the Spirantization OM, only three lack the Palatalization
OM as defined in 3.2 above. These are given in (9).


(9) /mutuQ/    →  /musuQ/    in a bad mood
    /gaQtiri/  →  /gaQsiri/  the body strongly built
    /gatiQ/    →  /gasiQ/    the body strongly built

Of these three forms, the /si/ and /ti/ sounds are automatically palatalized phonetically,
and they cannot be further palatalized. Hence, the only pure exception to Palatalization
as an OM is /musuQ/. If palatalized, the spirantized form would be */musyuQ/,
which would be derived from */mutyuQ/. This would lead to the conclusion that
Spirantization should obligatorily cooccur with Palatalization.

Other instances of Spirantization are given in (10): 


(10) /kutya/     →  /kusya/                      messy
     /gutya/     →  /gusya/                      cloth or paper crumpled
     /bityo/     →  /bisyo/                      wet, soaked to the skin
     /dota/      →  (/dotya/)    →  /dosya/      sound of falling down
     /petaN-ko/  →  /petyaN-ko/  →  /pesyaN-ko/  flat


It seems to me that those forms with non-spirantized /t/ in the bases entail a somewhat
sticky tone while the spirantized counterparts with /s/ do not necessarily, with the
latter sounding a little noisy. This may come from the acoustic property that the sibilant
/s/ involves greater energy than /t/, and that it can last longer than the plosive /t/.

Some of the examples in (10) can be traced back to non-spirantized non-palatalized 
bases. So /pesyaN-ko/ goes back to non-spirantized /petyaN-ko/, which in turn comes 
from non-palatalized /petaN-ko/. All of these three forms are in the database. These 
derivational relations illustrate the relative order of Palatalization and Spirantization.

Another instance is /dosya/ derived from /dota/, but with an unattested intermediate
form /dotya/. The others do not have non-palatalized forms. Non-palatalized /kuta/
exists, but it has the meaning ‘exhausted’. The same is true of /guta/ and /bita/. A full
investigation of the interaction of Palatalization and Spirantization is still needed.
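
The /t/ → /s/ alternation discussed in this section can be sketched as a simple string rewrite. The function name `spirantize` is hypothetical, and the rule "replace the first /t/" is a simplifying assumption: it reproduces the forms cited in this section, but it would pick the wrong consonant in a base with /t/ in both positions.

```python
# Hypothetical sketch of the Spirantization OM: the first /t/ becomes /s/.
# Simplifying assumption: the first /t/ in the romanized form is the target.

def spirantize(form):
    """Return the form with its first /t/ replaced by /s/."""
    if "t" not in form:
        raise ValueError("no /t/ to spirantize")
    return form.replace("t", "s", 1)


# Palatalized inputs, as in most attested cases
print(spirantize("kutya"))      # kusya
print(spirantize("petyaN-ko"))  # pesyaN-ko

# The non-palatalized exception /mutuQ/
print(spirantize("mutuQ"))      # musuQ

# Voiced and voiceless variants of the splashing sound
print(spirantize("patya"), spirantize("batya"))  # pasya basya
```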

3.4. summary. The three segmental OMs are arranged with regard to the frequency of 
occurrence in (11), together with the inventory of those consonant phonemes which 
are subject to each process. 

(11) Voicing (433)       Palatalization (56)         Spirantization (37)
     /p t k s h/     >   /p t k b d g s z h n/   >   /t/

The order in (11) reflects the relative importance of each OM. Voicing is clearly 
indispensable for the onomatopoeia lexicon in Japanese, both because of the pairs 
of voiceless-voiced consonants in the sound symbolic system and the proportion of 
obstruents in the realization of morphemes. Palatalization and Spirantization are 
much less prominent in the system of segmental OMs. 

Another view of the segmental OM system is given by the numbers of the pho¬ 
nemes that participate in each process. Five phonemes are subject to Voicing, ten 
to Palatalization, but only one to Spirantization. This suggests that Palatalization is 
a more important OM than Voicing. Yet Voicing occurs about 7.5 times more than 
Palatalization. If we include the voiced obstruents /b d g z/ as participating in Voic¬ 
ing, the number of the phonemes related to Voicing is nine, which approaches that of 
Palatalization. But the fact that Voicing OMs occur 7.5 times as often as Palatalization
OMs in spite of the fact that there are fewer potential targets for Voicing underlines
the significance of the Voicing process in sound symbolism. By either measurement,
however, Spirantization is the least significant of the three. 
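
The two measurements compared above can be checked with a few lines of arithmetic. The figures are those in (11); the variable and function names are hypothetical. With these counts the Voicing-to-Palatalization ratio comes out at roughly 7.7, in the neighborhood of the "about 7.5" cited in the text.

```python
# Counts taken from (11); names are illustrative only.
occurrences = {"Voicing": 433, "Palatalization": 56, "Spirantization": 37}
targets = {"Voicing": 5, "Palatalization": 10, "Spirantization": 1}


def ratio(a, b):
    """How many times more often OM `a` occurs than OM `b`."""
    return occurrences[a] / occurrences[b]


def ranking(measure):
    """OM names ordered from highest to lowest on the given measure."""
    return sorted(measure, key=measure.get, reverse=True)


print(round(ratio("Voicing", "Palatalization"), 1))  # 7.7
print(ranking(occurrences))  # ['Voicing', 'Palatalization', 'Spirantization']
print(ranking(targets))      # ['Palatalization', 'Voicing', 'Spirantization']
```

The two rankings disagree about the top position but agree that Spirantization comes last, which is the point made in the closing sentence of the paragraph.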

4. conclusion. The present study, though still preliminary, has a number of advantages.
First, by noting the various OMs, we can provide insight into large numbers of
semantically parallel but otherwise unrelated entries. Second, it also provides regularity
to the phonology, resulting in a certain amount of phonosymbolic iconicity. Third,
it suggests that the onomatopoeia lexicon of Japanese should be organized on the 
basis of the root morpheme. 

Similar approaches to onomatopoetic vocabulary may be taken with languages 
that have a much less systematic set of morphological patterns than Japanese, like 
English (Kadooka 1995) or Beijing Chinese (Kadooka 2001), and with languages that 
have a much more systematic set of morphological patterns, like Korean (Noma 2001). 
The typology of onomatopoetic vocabulary in many more languages must be studied 
before we can make general statements about the phonological patterns used.

This paper presents a discourse analysis of two parables in the New Testament:
the parable of the workers in the vineyard (Matthew 20:1-16) and the wedding feast
parable (Matthew 22:1-14).¹

A parable is a small, embedded discourse that is part of a much larger discourse. 
Parables contain some of the most recognized stories in the Bible, but they are also 
some of the most misunderstood and misinterpreted stories.² Each parable has a
specific audience and a set of points by which the audience is drawn into the par¬ 
able and enticed to action. A hearer is to understand different points of reference 
in a parable, identify himself with one point, and respond to the story’s unexpected 
turn. Subgroups in the audience may assume different points of reference and react 
differently to the parable. Fee and Stuart (1982:126) suggest that Jesus’ main purpose 
in telling a parable was ‘calling forth a response on the part of the hearer’. The key to 
understanding the parables is to ‘hear’ them in the same way that the original audi¬ 
ence would have heard them. Thus it is important to ascertain who the audience was 
for the parable and then to figure out what the points of reference are that would 
‘catch’ that audience and cause them to respond in some way. 

The two parables are each analyzed at the discourse level and divided into macro¬ 
segments of an aperture (a formulaic opening), prepeak and peak episodes, and a 
closure. A peak episode is ‘a zone of turbulence in regard to the flow of the discourse 
in its preceding and following parts’ (Longacre 1996:38). The parables are also ana¬ 
lyzed at the paragraph level, with each macrosegment divided into sentences and 
embedded paragraphs. The two parables, although different in internal paragraph 
structures, are similar in overall discourse structure reflecting a narrative schema, but 
with hortatory intent as related to the larger discourse. The audience of each parable 
is identified, and the subgroups are related to points of reference. The analysis of the 
parables tries to capture the Koine Greek structure, but literal translations in English 
are used for ease of presentation. 

1. the parable of the workers in the vineyard (Matt. 20:1-16). The parable
starts with the formulaic expression, ‘The kingdom of heaven is like’, in v. 1a, following
the well-established pattern of introducing a series of parables Jesus used to enhance 
our understanding of the kingdom of heaven. Verse 16, ‘Thus will be the last ones first 
and the first last’, is outside the story proper and is considered the closure containing
the moral. The rest of the material in the body of the story can be divided into
two episodes at the discourse level: the hiring process spanning almost the whole
day (vv. 1b-7), and the paying process (vv. 8-15) initiated by a conjunction (de) and a
participial clause ‘[when] evening having come’. The first episode recounts hiring five
groups of workers at different times: early in the morning and the third, sixth, ninth,
and eleventh hours (or 9 a.m., noon, 3 and 5 p.m.). The second episode provides the
paying process, starting from those hired last and going on to the first. It is here that
something unexpected occurs: they are all paid the same wage of a denarius, forming the
climax of the story. On the surface level of the morphosyntax, unusual features occur,
marking the episode as peak. The discourse structure of the parable is presented in (1),
with notional narrative schema slots in parentheses (the aperture is considered only a
surface slot with no notional correlate):


(1) Aperture: 20:1a
    Prepeak Episode (Inciting Incident): 20:1b-7
    Peak Episode (Climax): 20:8-15
    Closure (Moral): 20:16


The prepeak episode is expounded by a sequence paragraph, as shown in the indentation
diagram in (2).³ The first Sequential Thesis (ST) is filled by a simple paragraph
with Setting in v. 1b, with an introduction of the main participant, the landowner,
and the first action of the parable (in a relative clause): the landowner goes out
to hire workers. The Thesis in v. 2 begins with a participial clause, in which the landowner
makes the agreement for payment with the workers, and the main clause verb
‘sent’ is in aorist, the first grammatically marked mainline action.

(2) Prepeak Episode: Sequence ¶ (Matt. 20:1b-7)
    ST1: Simple ¶
        Setting: 1b a landowner who went out early in the morning to hire
            workers for his vineyard.
        Thesis: 2 And having agreed with the workers for a denarius for the
            day, (he) sent them into his vineyard.
    ST2: Simple Dialogue ¶
        Lead-In: 3 And having gone out around the third hour (he) saw
            others having stood in the marketplace idle.
        IU (Proposal): 4 And to them (he) said, ‘Go also you into the vineyard,
            and whatever may be considered right (I) will give to you.’
        RU (Nonverbal Response): 5a And they left.
    ST3: 5b And again having gone out around the sixth and the ninth hour
        (he) did likewise.
    ST4: Compound Dialogue ¶
        Lead-In: 6a And around the eleventh hour having gone out, (he)
            found others having stood
        Exchange 1: Simple Dialogue ¶
            IU (Question): 6b and (he) SAYS to them: ‘Why have (you) been
                standing here all day idle?’
            RU (Answer): 7a (They) SAY to him, ‘Because no one hired us.’
        Exchange 2: Unresolved Simple Dialogue ¶
            IU (Proposal): 7b (He) SAYS to them, ‘Go also you into the vineyard.’

Each ST corresponds to a group of workers hired at a different hour, except that ST3
includes two groups in v. 5. The temporal progression corresponds to the numbers of
the STs, from ST1 to ST4. The amount of detail given for each group shows the importance
of each group to the plot. Each time the landowner goes out, the same event
takes place. ST1 and ST2 contain quite a bit of information so that the listeners would
know what was going on. ST3, however, contains very little information; these workers
are not as important to the story. It summarily handles two different groups of
workers by the use of ‘again’ and ‘likewise’. Then in ST4, there is a lot of detail and a
dialogue with the last group of workers who were hired at the eleventh hour. Note also
that, after the stream of aorists in the narration of ST1-3, the historical present is used here
for the speech verb (legei ‘says.he’) in all three occurrences in vv. 6-7 (cf. Levinsohn
2000; Longacre 1999). These contrast with the eipen ‘said.he’ used in v. 4. All these features
(the detail, dialogue, and unusual tense) draw more attention to the workers
sent out at the eleventh hour because of their importance to the parable. Within the
prepeak episode, ST4 certainly is climactic.

The peak episode (vv. 8-15) recounts the paying process. Verse 8 provides the Setting
for the two STs (see diagram 3), with a direct speech by the owner to the foreman
to give the wage. Immediately, the audience is drawn in and the tension mounts
as they try to figure out how much each group will get paid. The last group receives a
denarius each in ST1, and the middle groups are not mentioned. Then ST2 (vv. 10-15)
deals with the interaction between the owner and the first group of workers. They get
the same amount! They are quite upset. It is here that we have reached the very zone
of turbulence within the peak episode.

(3) Peak Episode: Sequence ¶ (Matt. 20:8-15)
    Setting: 8 And evening having come, SAYS the owner of the vineyard to
        his foreman, ‘Call the workers and give them the wage, beginning
        with the last ones until the first ones.’
    ST1: 9 And those having come around the eleventh hour received each a
        denarius.
    ST2: Complex Dialogue ¶
        Lead-In: 10 And the ones having come first thought that a larger
            sum (they) would receive, but (they) received each a
            denarius also themselves.
        IU (Remark): Quote ¶
            Quote Formula: 11 And having received (it), (they) were complaining
                against the landowner saying,
            Quote: 12 ‘These last worked (only) one hour, and you made them
                equal to us who have endured the burden of the day and
                the heat.’
        CU (CounterRemark): Quote ¶
            Quote Formula: 13a But he, having answered one of them, said,
            Quote: Contrast ¶
                Thesis: Reason ¶
                    Reason: Reason ¶
                        Thesis: 13b ‘Friend, (I) am not treating you unjustly.
                        Reason: 13c Didn’t (you) agree with me for a denarius?
                    Thesis: 14a Take what is yours and go.
                Antithesis: Comment ¶
                    Thesis: 14b But (I) wish to give to this last the same as (I
                        gave) you.
                    Comment: Alternative ¶
                        Alternative Thesis 1: 15a Or am I not allowed to do what
                            (I) wish with what is mine?
                        Alternative Thesis 2: 15b Or are you envious (lit.: is your
                            eye evil) because I am good?’

There are several features in vv. 8-15 that help to mark the peak: participant reference,
tense change, long relative clause, rhetorical question, and crowded stage. Throughout
most of the parable, the subject is usually only marked in the verb, except when the
participant is first introduced or when an explicit reference is needed for disambiguation,
e.g., a switch of subject. Thus the landowner is introduced after the verb in v.
1b but is not overtly referred to again in the prepeak episode. At the beginning of the
peak episode in v. 8, when evening came, he is called ‘the owner of the vineyard’. This 
explicit reference in a noun phrase, even when there is no change of subject (between 
v. 7 and v. 8), helps to mark the episode boundary. Later in v. 13a, an explicit (articular) 
pronoun ho ‘he’ is used in addition to the usual agreement in the verb ‘said.he’. This 
use of the pronoun with de ‘but’ is for a switch of speakers in dialogue and adds an 
emphasis to the landowner. 

Tense is mostly aorist throughout the parable, but the verb legei ‘says.he’ in v. 8 
is in historical present (as in vv. 6-7, as noted above). In addition, there occurs one
verb in imperfect in v. 11, where the first group grumbles about the pay (they do it 
continuously). It is the only imperfect in an independent clause in the parable. Their 
complaint in the direct quotation in v. 12 also shows turbulence at peak by the long 
modification in a relative clause at the end: ‘to us who have endured the burden of the 
day and the heat’. Note that it does not give any new information but serves the function
of rhetorical underlining. The long speech by the owner in vv. 13-15 is a marked
feature as well, including three rhetorical questions. This section also contains a
crowded stage; the landowner, his foreman, and all the workers are there on stage. 

While the grammatical structure of the parable is presented in (1)-(3), there is a
clear chronological frame in chiastic structure according to the general content and
lexical items. The workers that are hired at different times are paid in reverse order.
Even the order of the first and the last in the thematic statements is reversed between
the sentences in 19:30 and 20:16.⁴ Both grammatical and chiastic principles of unity
and cohesion of the text are at work here. The two principles are often quite indepen¬ 
dent of each other, but contribute to the unity of the whole each in its own way. 

(4) 19:30 Many who are first will be last, the last first.
      20:1-2 Hiring workers early in the morning
        20:3-5a Hiring at third hour
          20:5b Hiring at sixth and ninth hours
            20:6-7 Hiring at eleventh hour
              20:8 Instruction for payment
            20:9 Paying those hired eleventh hour
          (no mention: Paying those hired ninth, sixth, and third hours)
      20:10-15 Paying those hired first
    20:16 The last ones will be first, and the first ones last.

The use of tense is somewhat chiastic as well: the historical present with speech verbs 
in the innermost structure (vv. 6-7 and v. 8), the usual aorist for mainline verbs in
the rest of the body (vv. 1-5 and vv. 9-13), and the future for moral statements at the
outermost (19:30 and 20:16). 

As for the audience of the parable of the workers, the larger context shows that 
Jesus has been speaking to the disciples earlier (Matt. 19:23), and that is probably what 
is happening here. Matthew 20:1 has the conjunction gar ‘for’, making an explicit con¬ 
nection to the previous section, which ends with the moral in Matt. 19:30. 

The points of reference in the parable of the workers would be the landowner, the 
first group of workers, and the last group. The other groups of workers are not given as 
much detail and do not appear at peak. Grammatically, the landowner is given direct 
participant tracking as the subject. The workers hired last engage in dialogue exchanges
with the landowner at the hiring stage; they are the only group to have such prominence in
the prepeak episode. At peak, however, they only serve as a foil to the highlighted full-day
workers, who are now the only group to have a dialogue exchange with the landowner
at the paying stage. Anyone listening who would identify with the first group of workers
in assuming unfairness on the part of the landowner would find themselves embarrassed
by the landowner’s reply to their objections. The intended audience is thus those
who would ‘identify with the full-day laborers, since they are the focus at the end’ (Fee &
Stuart 1982:130). We are likely to identify with the first group, but the message shows that
God’s mercy is equal for all and that we are all to be equally grateful.⁵

2. the parable of the wedding feast (Matt. 22:1-14). This parable opens with a
speech quote formula where Jesus is speaking to a multitude and some Pharisees: ‘And 
having answered, Jesus again spoke to them in parables, saying’ (v. 1). This is basically 
the introduction that a parable is coming and has a function only in the larger dis¬ 
course. The parable itself is the quote, which is expounded by an embedded narrative 
discourse. It starts with almost the same aperture as in the parable of the workers: ‘the 
kingdom of the heavens is like’. The main body of the parable is similarly divided into 
two episodes, closing with the same kind of moral statement. The prepeak episode 
itself is expounded by an embedded narrative discourse with its own two episodes, 
prepeak(em) and peak(em).⁶ The overall structure of Matthew 22:1-14 appears in (5).


(5) Quote Formula: 22:1
    Quote: Narrative Discourse
        Aperture: 22:2a
        Prepeak Episode (Inciting Incident): Narrative Discourse
            Prepeak(em) Episode: 22:2b-7
            Peak(em) Episode: 22:8-10
        Peak Episode (Climax): 22:11-13
        Closure (Moral): 22:14


The prepeak(em) episode (see diagram 6) is expounded by a sequence paragraph in
which the main participant is introduced in Setting with a relative clause: ‘a king who
prepared a wedding feast for his son’. ST1 contains the first eventline verb where the
king sends out his servants to call the ones who were invited to the feast. But they do
not want to come. ST2 begins with the word palin ‘again’, which is an introducer that
shows a repeated action. More details are given this time, with direct speech. Once 
more, the king sends out his servants to invite the guests. What the king wants them 
to say is given in a direct quotation. The responses this time are more detailed and are 
broken into three groups. Instead of coming to the feast, the first two went off—one 
to his own field and the other to his business. The others seized the king’s servants, 
insulting and killing them. 

The king’s response to their deeds is found in ST3 (v. 7). The king becomes angry,
sends out his army, destroys the murderers, and burns their town. The emphasis is on 
the actions of the king. The sending of the armies is in a preposed participial clause. The 
verb ‘destroyed’ is in third person singular as if the king was doing the destroying. 
The armies are the instrument by which the king performs the action of destroying and 
burning. The last two actions are stated in contrastive VO kai OV orders: ‘he.destroyed 
those murderers and their city he.burned’. 

At this point in the story, the king is in a quandary: What is he to do for guests for the 
wedding? It is here the peak(em) starts. The king instructs his servants to go out into 
the streets and invite everyone. The servants go out and bring both good and bad people 
for the feast, filling the wedding chamber with guests. Thus peak(em) is expounded by 
an execution paragraph: the king’s plan is carried out (executed) by his servants.

(6) Prepeak Episode: (Embedded) Narrative Discourse (Matt. 22:2b-10)
    Prepeak(em) Episode: Sequence ¶
        Setting: 2b a king who prepared a wedding feast for his son.
        ST1: 3 And (he) sent out his servants to call those invited to the
            wedding feast, but (they) wished not to come.
        ST2: Counterexpectation/Frustration ¶
            Thesis: 4 Again, (he) sent out other servants saying, ‘Tell those
                invited: “Behold, my banquet (I) have prepared, my oxen
                and fat calves having been killed; and all things (are)
                ready, come to the wedding feast.”’
            Surrogate Thesis 1: 5 But they, having paid no attention, left, one to
                his field, another to his business.
            Surrogate Thesis 2: 6 And the others, seizing his servants, mistreated
                and killed (them).
        ST3: 7 And the king was angry, and having sent his army, (he)
            destroyed those murderers, and their city (he) burned.
    Peak(em) Episode: Execution ¶
        Plan: 8 Then (he) SAYS to his servants, ‘Indeed the feast is ready, but
            those invited were not worthy. 9 Go therefore to the main
            streets, and as many as (you) find there, invite to the feast.’
        Execution: 10 And having gone out into the streets, his servants
            gathered all the people (they) found, both bad and
            good, and the wedding chamber was filled with
            guests.

The seam of the two episodes within the embedded narrative is marked slightly differently
from that of the parable of the workers. Here, while tote ‘then’ occurs initially
in the sentence and episode, there is no explicit reference to the king as there is to the
landowner in Matt. 20:8. The king is the continuing agent-subject from v. 7, although
he interacts with different groups, first with the murderers and then with his servants.
In the whole parable, there are four explicit references to the king in a noun phrase,
the first at the first introduction (v. 2), and the rest when there is a switch of subject
across sentences (vv. 7, 11, 13). All these occurrences are required to make the participant
reference clear.

The peak episode (vv. 11-13, in diagram 7) centers around the man with no wedding
garment, throwing an interesting twist into the story. It is unexpected enough that
the king becomes so generous as to invite even bad people from the streets to the
wedding feast. But now that these people from the streets are at the feast, the king
comes in and notices that one of them is not dressed properly. The king has him
thrown out into the outer darkness.⁷

This episode constitutes a climax reflecting the highest tension. It is filled by a
compound dialogue paragraph. After the Lead-In sentence, Exchanges between the
king and two separate groups follow. The first exchange consists of his (rhetorical?)
Question to the man who has no answer. The second occurs with his servants: his
instruction to throw the man out (Proposal), to be followed by the unstated Response,
yielding an unresolved simple dialogue paragraph.

(7) Peak Episode: Compound Dialogue ¶ (Matt. 22:11-13)

Lead-In: 11 But the king, having come to see the guests, saw a man not
wearing wedding clothes.

Exchange 1: Complex Dialogue ¶

IU (Question): 12a And (he) SAYS to him, ‘Friend, how did (you)
enter here not having wedding clothes?’

CU (Nonverbal reaction): 12b But he (the man) was speechless.

Exchange 2: Unresolved Simple Dialogue ¶

IU (Proposal): 13 Then the king said to the servants, ‘Having bound
his feet and hands, throw him into the outer darkness;
there will be wailing and gnashing of teeth.’

The peak of this story is not as well marked in the surface structure as that of the other
parable. There is, for example, no long stretch of direct speech with rhetorical questions,
but there are a few things that provide clues. First of all, the speech verb legei ‘say’ in
this episode (v. 12a) is in the present tense, while the final one in v. 13 is in the usual
aorist. The use of the historical present makes this section more vivid. An articular
pronoun ‘he’ is used with de ‘but’ in v. 12b for a switch of reference, adding emphasis
to the man and his reaction. In some sense, this peak episode represents a crowded
stage as well, having all the participants on stage in the background but focusing on
the king and the man.

The final verse (v. 14) constitutes the moral: ‘For many are called but few chosen.’ 
This statement is much better understood after having heard the parable than if heard 
on its own. Notice that the didactic statement of both parables is a kind of commentary 
or evaluation by Jesus. Within the parables, Jesus does not interject commentaries or 
explanations as to meaning. He merely tells the story and then gives the main teaching 
point that is expected to bring about action by the listeners. 

The audience of the wedding feast parable is fairly obvious. Jesus is speaking to the
chief priests and the Pharisees (Matt. 21:45), with a large multitude listening as well.
The king, the group who rejected their invitation, the group invited from the streets,
and the man not dressed properly seem to be the points of reference. The king does
most of the eventline action in the story, while those invited from the streets and
the man play a very important role. Those who rejected the invitation play a crucial
role in building the tension of the story, but are not referred to again after v. 7. The
Pharisees might relate to the point of reference of those who are originally invited but
do not come. In listening to the story the Pharisees would get caught when the king
destroys them and then invites both good and bad people into the wedding feast. But
this is not the place where the multitude who are also listening are drawn in. They are
probably rejoicing that the king has invited all the average people. But they get caught
when one of these people gets thrown out because he is not dressed appropriately.

3. conclusion. This paper has analyzed the discourse structure of two parables to see 
how discourse features are grammatically marked. While the embedded paragraph 
structures within the episodes differ considerably, from compound and complex dia¬ 
logue paragraphs to sequence and execution paragraphs, there is an overall similarity 
in discourse structure. Both stories start with an aperture and close with a moral, with 
prepeak and peak episodes in the body. The differences tend to be in the details at the 
paragraph level as to how sentences and embedded paragraphs relate to each other. 
The differences in points of reference are a result of Jesus directing his parables to 
specific audiences through points at which they can relate to the story. 

Below is a comparison of the surface structures from the two parables as we analyzed
them.

                   Matt. 20:1-16           Matt. 22:2-14

Aperture:          1a                      2a

Prepeak Ep:        Sequence ¶              Narrative Discourse
                     ST1: 1b-2               Prepeak(em) Ep: Sequence ¶
                     ST2: 3-5a                 Setting: 2b
                     ST3: 5b                   ST1: 3
                     ST4: 6-7                  ST2: 4-6
                                               ST3: 7
                                             Peak(em) Ep: Execution ¶
                                               Plan: 8-9
                                               Execution: 10

Peak Ep:           Sequence ¶              Compound Dialogue ¶
                     Setting: 8              Lead-In: 11
                     ST1: 9                  Exchange 1: 12
                     ST2: 10-15              Exchange 2: 13

Closure (Moral):   16                      14


To briefly refer to the conference theme, what counts as legitimate evidence in
linguistics, we argue for the use of recurring patterns in naturally occurring data.
The data-based approach is fruitful, especially when we analyze a unit larger than the
sentence. Native speakers are rarely conscious of discourse-level features. Notice, for
example, the use of explicit reference (‘the owner of the vineyard’) at the beginning of
a new episode although there is no subject switch. Note also the shift in tense and the
use of rhetorical questions at peak. Such variation is likely to be ‘intuitively’ labeled
just ‘optional’ at the sentence level, but each variety has a unique function in discourse.
When we analyze a language such as Koine Greek with only written data and no
native speakers, the use of textual data may be the only approach available.

1 We express our gratitude to Marlin Leaders, David Robinson, and Bruce Turnbull for their
comments on the earlier versions of the paper.

2 Fee and Stuart (1982:123) point out that ‘the parables have suffered a fate of misinterpretation
in the church second only to the Revelation.’

3 The readers are referred to Longacre (1996, chapters 4 and 5) for a detailed discussion 
of paragraph and dialogue types. The abbreviations used here follow his convention: 
ST=Sequential Thesis, IU=Initiating Utterance, RU=Resolving Utterance, CU=Continuing 
Utterance. The data are given in literal translations in English to reflect the Koine Greek 
structure. The aorist tense/aspect, which usually marks the eventline of narrative, is indi¬ 
cated in boldface, the present in capital letters, and the imperfect in italics. The tense is 
indicated for main clause verbs and those in relative clauses. The implied information 
needed in English is shown in parentheses. 

4 Matthew 19:30, ‘But many who are first will be last, and the last first’, is the final verse of the 
previous chapter, and 20:16 is the closure (moral) of the parable as outlined in (1). Both 
19:30 and 20:16 are the thematic statements for the parable, bracketing the story. 

5 The closure further points out that those who were first may be last since they are not as 
grateful as those who were originally last but are now first, being filled with gratitude for 
God’s abundant grace. 

6 To avoid the confusion, the embedded discourse episodes are marked by (em), e.g., 
peak(em). 

7 We believe that our analysis as shown in diagrams 5-7 is ‘plausible’. We have combined vv.
2b-7 and vv. 8-10 as two episodes in the same embedded discourse dealing with God’s invitation
for all of us to his kingdom. A short peak episode follows in vv. 11-13, with the theme
of judgment, that we need to wear special garments (i.e., we need to be changed and reborn)
to respond to his invitation. Alternative analyses, however, may be possible, e.g., combining
vv. 8-10 and vv. 11-13 together as a sequence of events occurring on the same day. This, we
believe, is a less preferred approach to text analysis since chronological time is in general secondary
to grammatical criteria. Often the author chooses to bundle up events in a sentence
or paragraph in which a considerable time lapse is involved. See, for example, v. 7, which
states in one sentence a series of events that would have taken time, for the king to send his
army, kill the murderers, and burn their city.

WHAT CONSTITUTES ACOUSTIC EVIDENCE OF PROSODY?
THE USE OF LINEAR PREDICTIVE CODING RESIDUAL SIGNAL
IN PERCEPTUAL LANGUAGE IDENTIFICATION

introduction. Language Identification (LID) is the process of identifying the language
used in speech.¹ The cues for identifying languages are classified into two types:
segmental and prosodic. In the field of automatic LID by computers, much of the
research so far has focused on utilizing segmental features contained in the speech
signal (Muthusamy et al. 1994), although some research also suggests the importance
of incorporating prosodic information into the system (Thyme-Gobbel & Hutchins
1996; Itahashi et al. 1999; etc.). In contrast to this engineering research scene, most of
the research on perceptual LID by humans has focused on prosodic information.

Humans’ capacity for LID, or perceptual LID, has drawn the attention of researchers
from engineering, linguistics, and psychology. The typical method of research is to
conduct perceptual experiments with stimulus signals that are supposed to contain
the prosodic information of certain languages but no segmental information. In
other words, the signals are used as ‘acoustic evidence of prosody’ in the argument.
The stimuli used in the experiments have varied and are not consistent across
researchers. The critical question here is whether the signals used really represent the
prosody of language, or more specifically, what represents prosody acoustically.

In this paper, I argue for the use of the Linear Predictive Coding (LPC) residual
signal as the stimulus in perceptual LID. LPC is a basic technique used for analyzing
and resynthesizing speech. It is based on the source-filter theory of speech production
and separates the speech signal into a residual signal representing the source part
and LPC coefficients representing the filter part. I argue that the LPC residual signal, if
its intensity is adjusted, represents the prosodic information in the speech signal, and
thus it is feasible to use it to test for the role of prosody in perceptual LID. I discuss
in particular the factors relevant to syllable structure, citing data from the research
recently carried out by the LID research group at Sophia University.

1. stimuli in perceptual lid research. A variety of signals have been used as
stimuli in perceptual experiments. Most studies have used signals that were presumed
to represent the prosody of speech. In the experiments, researchers played a stimulus
and asked their subjects to choose one from a small set of languages
or dialects, or adopted some other method of observing the subjects’ responses. The
stimuli used to represent prosody in previous experiments include lowpass-filtered
speech (Atkinson 1968; Moftah & Roach 1988; Mugitani et al. 2000), laryngograph
output (Maidment 1976, 1983; Moftah & Roach 1988), triangular pulse trains or sinusoidal
signals (Ohala & Gilbert 1979; Barkat et al. 1999), LPC-resynthesized or residual
signals (Foil 1986; Mori et al. 1999), white-noise driven signals (Mori et al. 1999), and
resynthesized signals preserving or degrading broad phonotactics, syllabic rhythm, or
intonation (Ramus & Mehler 1999). Although all of these experiments showed that
‘prosody’ plays a role in LID, the stimuli used differ from each other in the amount of
information they carry; that is, the acoustic definitions of prosody are not consistent
across studies. An appropriate selection of stimuli is needed for further research.

It is clear that none of the stimulus signals used in the preceding studies, except for
the LPC signals and Ramus and Mehler’s signals (1999), properly represents prosody in
speech. In lowpass-filtered speech, some segmental information is preserved under the
cutoff frequency, pitch sometimes rises higher than the cutoff, and intensity is not preserved
(ibid). The laryngograph output is an indication of short-term variations of glottal
electrical resistance and is virtually uninfluenced by supraglottal resonance and noise
source (Moftah & Roach 1988). This means that it is not representative of the speech
output that we actually hear in usual situations. Due to the loss of resonance and
noise source, it does not contain sonority information, the importance of which I argue
for in this paper. In the simulation of prosody with pulse or sinusoidal trains, the noise
source is not taken into account, either. The white-noise driven signal keeps the intensity
contour of the original speech but does not have any other information, e.g., pitch.

Signals made by the LPC technique are new in the history of research on perceptual
LID. The idea can be traced back to Foil’s (1986) experiment, but it was simply a preparatory
test for developing an automatic LID system. Foil resynthesized speech by LPC
with its filter coefficients held constant, resulting in a speech signal whose spectrum
was constant over time. Mori et al. (1999) were the first to apply the LPC technique in
research on perceptual LID. In the LID test of English and Japanese, they used a
residual signal with its intensity adjusted so that it had the same intensity contour as
the original speech. Theoretically, the LPC technique can remove most of the spectral
contour information, and it creates a signal that sounds like muffled speech and is
unintelligible enough for a perceptual LID experiment. This signal is the topic of the
discussion in this paper.

The set of signals made by Ramus and Mehler (1999) should be noted, too. They
conducted perceptual experiments on English and Japanese, controlling broad phonotactics,
syllabic rhythm, and intonation. They segmented the original English and
Japanese speech into phonemes and replaced them by French phonemes to exclude
the segmental cues to LID. They created four types of stimulus signals differing in the
information they contain: ‘saltanaj’, ‘sasasa’, ‘aaaa’, and ‘flat sasasa.’ In ‘saltanaj’, all fricatives
were replaced by /s/, stops by /t/, liquids by /l/, nasals by /n/, glides by /j/, and
vowels by /a/. In ‘sasasa’, all consonants were replaced by /s/, and vowels by /a/. In ‘aaaa’,
all segments were replaced by /a/. ‘Flat sasasa’ was the same as ‘sasasa’ but its fundamental
frequency was made constant. The information that each stimulus contained
and the results of LID tests are summarized in Table 1. Ramus and Mehler concluded
that syllabic rhythm is a necessary and sufficient cue.

              Intonation   Syllabic rhythm   Broad phonotactics   Result of LID
saltanaj          +               +                  +            successful
sasasa            +               +                  -            successful
aaaa              +               -                  -            unsuccessful
flat sasasa       -               +                  -            successful

Table 1. Stimuli and LID results of Ramus and Mehler (1999). ‘+’ indicates presence of
cue, and ‘-’ indicates absence of cue.
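
As a rough illustration of the substitutions summarized above, the following Python sketch maps a sequence of broad segment classes to the replacement phonemes of each stimulus type. The function name, the class labels, and the input sequence are hypothetical and are not taken from Ramus and Mehler (1999); the flattening of fundamental frequency in ‘flat sasasa’ is a property of the signal rather than of the labels and is not modeled here.

```python
# Replacement phoneme for each broad class in the 'saltanaj' condition,
# following the description in the text above.
SALTANAJ = {
    "fricative": "s",
    "stop": "t",
    "liquid": "l",
    "nasal": "n",
    "glide": "j",
    "vowel": "a",
}


def substitute(classes, condition):
    """Return replacement phonemes for a sequence of broad class labels.

    The label names and this function are illustrative assumptions.
    """
    if condition == "saltanaj":
        return [SALTANAJ[c] for c in classes]
    if condition in ("sasasa", "flat sasasa"):
        # All consonants become /s/. 'flat sasasa' differs only in F0,
        # which is not represented at the level of labels.
        return ["a" if c == "vowel" else "s" for c in classes]
    if condition == "aaaa":
        return ["a"] * len(classes)
    raise ValueError(f"unknown condition: {condition}")
```

For a hypothetical input such as stop, vowel, nasal, glide, vowel, the three label-level conditions keep progressively less of the original class sequence, which is the contrast that Table 1 records as the presence or absence of broad phonotactics and syllabic rhythm.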

All of the studies cited so far have argued for the importance of prosody, but the
stimuli used differ from each other in terms of which acoustic features they contain,
as discussed above, and thus the evidence that their arguments are based on is not
consistent. Further, the acoustic properties of prosody have not been examined through
a comparison of stimuli, except by Ramus and Mehler (1999) and
Mori et al. (1999).

2. prosody in linguistics. Linguistic features that constitute a prosodic typology
include accent (stress, pitch, and tone), intonation, and rhythm. Their acoustic correlates
are, basically, fundamental frequency, intensity, and length. However, the notion
of rhythm is quite controversial, and therefore the determination of its acoustic correlates
needs further consideration.

Traditionally, the rhythm types of languages are classified into ‘stress-timed’,
‘syllable-timed’, and ‘mora-timed.’ However, the manifestation of these different rhythms
has not been clear in the light of experimental phonetics. For example, Japanese is
said to be mora-timed, but researchers do not agree on how this is defined (Warner
& Arai 2001a). The claims that have been made so far may be classified into timing
hypotheses, rhythm hypotheses, and other alternatives. The timing hypotheses
assume that the mora is an isochronic unit or that the length of a higher-level
structure such as a word is predictable from the number of morae in it. The rhythm
hypotheses claim that the rhythmic difference is the reflection of structural factors,
such as syllable structures, phonotactics, etc., rather than timing specifically. The
experiments by Ramus and Mehler (1999), discussed in Section 1, support this claim:
They define ‘syllabic rhythm’ as the temporal alignment of consonants and vowels,
which is the reflection of syllable structures, and show that it is essential to the perceptual
discrimination of languages. Besides these hypotheses, there are alternative
claims, such as Tajima’s (1998) focusing on the competence of coordinating units in
speech production, or Cutler and Otake’s (1997) discussion of the role of the unit
in perception. In the present paper, I adopt the rhythm hypothesis, following Ramus
and Mehler (1999), Ramus et al. (1999), and Warner and Arai (2001b), among others.

Assuming that syllable structure is a contributor to rhythm, I claim that the sonority
feature is linguistically prosodic because syllable structure can be identified
phonologically from sonority contours by the Sonority Sequencing Principle (cf.
Kenstowicz 1994). However, the sonority feature is also segmental by nature because it is closely
related to the articulatory manner of segments. It also acts to introduce some broad
phonotactic information. My claim is that the sonority feature is ambivalently prosodic
and segmental, and that it is an important cue to LID because it constitutes rhythm.

Source-filter theory:   Source                                    Filter
LPC:                    Residual signal (pulse / white noise)     Coefficients
Acoustic features:      Pitch, harmonics/noise alternation        Spectral envelope
                        (+ intensity)
Linguistic features     Prosody                                   Segmentals
(approximation):

Figure 1. Correspondence of source-filter theory, LPC, acoustic features, and linguistic
features.

3. LINEAR PREDICTIVE CODING (LPC). 

3.1. model. LPC analysis is based on the source-filter theory, and it separates the
speech signal into the residual signal and coefficients, representing the source and
filter respectively. The residual signal has a flattened spectrum, being similar to
pseudo-periodic pulses for vowels and white noise for consonants. Acoustically, the
residual has information on the fundamental frequency and the alternating pattern of
harmonic structures and noise, while the coefficients have information on the spectral
envelope. The intensity information is not carried by the residual signal, so it must
be added later for use in perceptual LID experiments. The residual signal, after the
addition of intensity, can be regarded as roughly corresponding to prosodic features
at the linguistic level, and the coefficients to segmental features. See Figure 1.
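
The separation into coefficients and residual can be sketched as follows. This is a minimal illustration for a single frame, assuming the autocorrelation method with the Levinson-Durbin recursion, which is one common way of computing LPC; the studies cited here do not state which method they used. The function names are hypothetical, and windowing, pre-emphasis, and frame overlap are omitted.

```python
def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations by the Levinson-Durbin recursion.

    Returns predictor coefficients a[1..order] such that each sample is
    predicted as sum(a[k] * x[n - k]). These describe the filter part.
    """
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # silent or perfectly predicted frame
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1.0 - k * k
    return a[1:]


def lpc_residual(frame, coeffs):
    """Inverse-filter the frame: e[n] = x[n] - sum(a[k] * x[n - k]).

    The result is the residual, i.e., the source part.
    """
    residual = []
    for n, sample in enumerate(frame):
        prediction = sum(a * frame[n - k]
                         for k, a in enumerate(coeffs, start=1) if n - k >= 0)
        residual.append(sample - prediction)
    return residual
```

In the experiments discussed in Section 4, the order passed to such a routine would be 22 at a 16 kHz sampling rate or 16 at 8 kHz. The residual obtained this way has a roughly flat spectrum and carries no reliable intensity contour, which is why the intensity must be restored afterwards.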

Note that LPC is a mathematical operation dividing a signal into the source function
and the transfer function, which do not correspond exactly to the activities of the
glottis and the vocal tract in speech production. Unlike the laryngograph output, the
residual signal is not representative of the glottal activity; rather, it represents
the prosodic features of the speech emitted from the mouth.

It is also important to note that the simple dichotomy at the level of linguistic features
pictured in Figure 1 is just an approximation. It is impossible to relate prosodic
and segmental features to non-overlapping sets of acoustic features. First, the sonority
feature at the linguistic level is both prosodic and segmental by nature, as already
discussed in Section 2. Consequently, the acoustic features representing sonority contribute
to both prosodic and segmental features. Second, while acoustic features such
as pitch, intensity, and length are essential to prosody, they also contribute to the perception
of segments. (Length is not explicitly mentioned in Figure 1; it is represented
as the temporal change of any feature.) These facts raise a problem when we try to
extract acoustic correlates of prosody for perceptual LID experiments. Theoretically, it
is impossible to completely separate prosody from segmental properties at the acoustic
level. The signal representing prosody must contain the information on pitch,
intensity, length, and sonority, but if it does, it then also contains some segmental
information. There can be no acoustic signal that carries only prosodic features and
does not carry any segmental features. Therefore, it is important to seek a signal that
contains enough prosodic information but reasonably little segmental information as
a practical approximation.

3.2. residual signal. If intensity information is added, the LPC residual signal acoustically
retains the fundamental frequency contour, the intensity contour, and the
alternating pattern of harmonic structures and noise components. The fundamental
frequency contour and the intensity contour are the perceptual cues to prosodic
features in general. The presence/absence of harmonic structures and noise components
together with intensity information may serve as the perceptual cues to vowel/consonant
distinction and major classes of consonants, which in turn define the
sonority of segments that indicates syllable structures.

The LPC residual is expected to represent sufficient prosodic features, including
sonority, while it effectively suppresses segmental features.
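
One simple way to add the intensity information back is to rescale the residual frame by frame so that its root-mean-square (RMS) level follows that of the original speech. The sketch below is an assumption about how such an adjustment could be done, not a description of the procedure used in the cited studies; the function names and the frame length of 160 samples are hypothetical, and the two signals are assumed to have the same length.

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples (0.0 if empty)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def match_intensity(residual, original, frame_len=160):
    """Scale each frame of `residual` to the RMS of the same frame of `original`.

    frame_len is an illustrative value, not one reported in the source.
    """
    adjusted = []
    for start in range(0, len(residual), frame_len):
        res_frame = residual[start:start + frame_len]
        org_frame = original[start:start + frame_len]
        res_level = rms(res_frame)
        # A silent residual frame cannot be scaled up, so it stays silent.
        gain = rms(org_frame) / res_level if res_level > 0.0 else 0.0
        adjusted.extend(gain * s for s in res_frame)
    return adjusted
```

A smoother result could be obtained by interpolating the gain between frames, but the frame-wise version is enough to show how the intensity contour of the original can be imposed on a spectrally flattened signal.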

4. perceptual tests with lpc residual. Perceptual tests support the idea that LPC
residuals represent prosodic features including sonority, or syllable structures, while
effectively suppressing segmental features. The result of an experiment on Japanese
consonant perception showed that the identification rate of major classes, which define
sonority, was high, while that of phonemes was low (Komatsu, Tokuma, et al. 2000).
Another experiment using Japanese, English, and Spanish consonant clusters also
yielded the result that the identification rate of major class features was much higher
than that of individual phonemes (Komatsu, Shinya, et al. 2000). In an LID experiment,
the LPC residuals made from whole speech achieved much higher scores than
the LPC residuals whose consonantal sections were muted (Komatsu et al. 2001),
supporting the importance of sonority.

4.1. Japanese consonants. Komatsu, Tokuma, et al. (2000) conducted experiments
on Japanese consonant perception with several signals including the LPC residual
signal. They created the residual signal from 17 Japanese /C/+/a/ syllables (/ka/, /ga/,
/sa/, /za/, /ʃa/, /dʒa/, /ta/, /da/, /tʃa/, /na/, /ha/, /ba/, /pa/, /ma/, /ja/, /ra/, /wa/) by LPC
(sampling rate: 16 kHz; order of LPC: 22) and adjusted the intensity of the residual signals
to match their original samples. The spectrum of the residual was tilted at -6 dB/oct
to make it sound like speech rather than noise. They presented these signals in perceptual
experiments to 15 native speakers of Japanese and 4 native speakers of English
learning Japanese as L2.

             [consonantal]   [approximant]   [sonorant]   Sonority
Obstruents         +               -              -       Low
Nasals             +               -              +
Liquids            +               +              +
Glides             -               +              +       High

Table 2. Major class features (Komatsu, Tokuma, et al. 2000).

Phonemes        20.0%
Major classes   66.4%

Table 3. Identification rates by Japanese speakers in Komatsu, Tokuma, et al. (2000).

They obtained the subjects’ correct identification rates of phonemes and major 
classes (see Table 2 for their definitions). The rates are calculated in such a way that, 
if /ka/ is perceived as /ta/, it is counted as wrong for the identification rate for pho¬ 
nemes, but counted as correct for the identification rate for major classes because 
both are obstruents. 
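
The two rates can be computed from the same set of responses, as the following Python sketch shows. The response data are invented for illustration, the function name is hypothetical, and the class table covers only a few consonants whose major class is uncontroversial; it is not the full inventory used by Komatsu, Tokuma, et al. (2000).

```python
# Partial, illustrative mapping from consonants to major classes (cf. Table 2).
MAJOR_CLASS = {
    "k": "obstruent", "t": "obstruent", "s": "obstruent",
    "n": "nasal", "m": "nasal",
    "r": "liquid",
    "j": "glide", "w": "glide",
}


def identification_rates(responses):
    """Return (phoneme rate, major-class rate) for (presented, perceived) pairs."""
    total = len(responses)
    phoneme_hits = sum(1 for shown, heard in responses if shown == heard)
    class_hits = sum(1 for shown, heard in responses
                     if MAJOR_CLASS[shown] == MAJOR_CLASS[heard])
    return phoneme_hits / total, class_hits / total


# Hypothetical responses: /k/ heard as /t/ is wrong as a phoneme
# but correct as a major class, because both are obstruents.
rates = identification_rates([("k", "t"), ("k", "k"), ("n", "m"), ("r", "w")])
```

With these four invented responses, one is correct at the phoneme level and three are correct at the level of major classes, so the major-class rate is necessarily at least as high as the phoneme rate.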

The results from Japanese native speakers are shown in Table 3. The identification
rate of major classes is high, while that of phonemes is quite low. Considering the fact
that the identification rate of phonemes cannot be as low as 1/17 (5.9%), because
sonority is a cue to the manner of articulation, the results indicate that segmental
information is effectively suppressed.

Another noticeable observation in these results is a tendency for sonority to be
misheard as lower than it was. However, we cannot discuss the
identification rates of individual major classes or major class features because the
number of samples was balanced for phonemes but not for major classes or major
class features.

In their experimental results, English native speakers generally showed better
identification rates than Japanese native speakers. This suggests that the sensitivity to
certain acoustic properties may be different depending on the speaker’s first language,
though the number of English native speakers was too small to make
a definite statement.

4.2. Consonant clusters of Japanese, English, and Spanish. Komatsu, Shinya,
et al. (2000) conducted perception tests on consonants in consonant clusters to see
the effects of phonotactic constraints. They created the residual signal by the same
signal processing method as Komatsu, Tokuma, et al. (2000). They used words that
begin with consonant clusters as shown in Table 4. Note that Japanese stimuli include
palatalized consonants and consonant-vowel clusters where the vowel is devoiced
because Japanese does not have consonant clusters at the word onset. They provided
these samples to the native speakers of the language of each stimulus: 15 Japanese
speakers, 4 English speakers, and 1 Spanish speaker.

Language   Word-initial clusters   Number of types   Word length
Japanese   C                       14                2 or 3 morae
           Cj                      11
           CVC (V devoiced)        17
English    C                       21                1 or 2 syllables
           CC                      39
           CCC                      9
Spanish    C                       18                2 syllables
           CC                      11

Table 4. Stimulus words (Komatsu, Shinya, et al. 2000).

The results showed that the identification rates of major class features were much 
higher than those of phonemes across all languages and cluster types. 

The results also showed cross-linguistic differences. English native speakers showed 
better identification rates than Japanese native speakers, as in the experiments by 
Komatsu, Tokuma, et al. (2000). The results were also affected by phonotactic con¬ 
straints, which are language dependent. 

4.3. lid with/without consonant sections. Komatsu et al. (2001) investigated the
effects of the information that consonant sections have on perceptual LID using
the residual signal. They created the residual signal from 10-second chunks of English
and Japanese spontaneous speech. They made the residual signal by LPC (sampling rate:
8 kHz; order of LPC: 16), lowpass-filtered the signals at 1 kHz to ensure spectral removal,
and adjusted the intensity of the signals to match their original samples. They made
two types of residual signals: signals that consist of entire speech chunks and
signals in which the consonant sections were suppressed to silence. They conducted the
perceptual LID tests with 32 Japanese monolinguals and 10 Japanese-English bilinguals
for each type of signal. The subjects were asked to choose one from ‘English’, ‘Probably
English’, ‘Probably Japanese’, and ‘Japanese’ after listening to each stimulus.
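
The second type of stimulus, in which consonant sections are suppressed to silence, can be pictured with the following sketch. It assumes that the consonant sections are already available as pairs of sample indices, for example from a hand-labeled segmentation; the function name and that representation are hypothetical and are not taken from Komatsu et al. (2001).

```python
def mute_sections(samples, sections):
    """Return a copy of `samples` with each [start, end) section set to silence.

    `sections` is a list of (start, end) sample indices, an assumed format.
    """
    muted = list(samples)
    for start, end in sections:
        for i in range(max(start, 0), min(end, len(muted))):
            muted[i] = 0.0
    return muted
```

At the 8 kHz sampling rate reported above, a consonant lasting from 0.25 s to 0.30 s would correspond to the index pair (2000, 2400). Muting keeps the overall duration and the timing of the vowel sections intact while removing the harmonics/noise alternation contributed by the consonants.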

The results are shown in Table 5. The index of discriminability (D
index) was calculated in such a way that ‘English’ and ‘Japanese’ were scored as ±2
while ‘Probably English’ and ‘Probably Japanese’ were ±1. Positive values indicate correct
responses, and negative values incorrect ones. The averaged D index ranges from -2 to
+2, where 0 indicates random responses. It is evident that consonant sections play
an important role in LID. The loss of the sonority features of consonants drastically
reduces the accuracy of LID.

                                       Japanese subjects   Bilingual subjects
Stimulus with consonant sections             1.17                1.24
Stimulus without consonant sections          0.35                0.23

Table 5. D indices in Komatsu et al. (2001).


These results also suggested variation due to the properties of languages and sub¬ 
jects’ linguistic knowledge. 
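
The scoring behind the D index can be written out as a short Python sketch. The trial data are invented for illustration and the function name is hypothetical; only the scoring scheme (±2 for definite answers, ±1 for ‘Probably’ answers, positive when correct) comes from the description above.

```python
# Score of each response when the stimulus is English; the sign is
# reversed when the stimulus is Japanese.
SCORE_FOR_ENGLISH = {
    "English": 2,
    "Probably English": 1,
    "Probably Japanese": -1,
    "Japanese": -2,
}


def d_index(trials):
    """Average signed score over (stimulus language, response) pairs."""
    total = 0
    for language, response in trials:
        score = SCORE_FOR_ENGLISH[response]
        total += score if language == "English" else -score
    return total / len(trials)


# Hypothetical trials: three correct responses of varying confidence
# and one incorrect 'Probably' response.
example = d_index([
    ("English", "English"),
    ("English", "Probably Japanese"),
    ("Japanese", "Japanese"),
    ("Japanese", "Probably Japanese"),
])
```

The four invented trials score +2, -1, +2, and +1, giving an average of 1.0, which is of the same order as the values in the first row of Table 5.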

5. conclusions. This paper has argued that the LPC residual signal, if its intensity is
adjusted, represents the prosodic features in the speech signal and that it can be used
to test for the role of prosody in perceptual LID. It has paid special attention to the factors
relevant to sonority or syllable structure.

The sonority feature is both prosodic and segmental by nature. Because it indicates
syllable structures, and hence constitutes rhythm, it is important for perceptual LID.
It was shown that the LPC residual signal with its intensity adjusted contains prosodic
features including sonority while suppressing segmental features (see sections 4.1 and
4.2). The importance of sonority features was attested (see section 4.3). Therefore, the
LPC residual signal properly represents prosody, and it provides the ‘acoustic evidence
of prosody’ in perceptual LID research.

Compared with other stimuli used in preceding studies, the LPC residual may be
regarded as Ohala and Gilbert’s signal (1979), which represents the prosody of only
voiced sections, with the noise source added.

Some of Ramus and Mehler’s signals (1999) have information similar to that in the
LPC residual. However, their approach to the issue of how to extract prosody is fundamentally
different from the method using LPC. They first divide speech into segments
such as phonemes, and then substitute those segments by others to exclude the segmental
effects of the original speech. Here, they assume some phonological unit into
which they can segment speech and operate on the sonority of this unit. On the other
hand, the LPC analysis does not assume any segmental unit, and it directly operates on
the continuous change of acoustics in the speech signal. The two methods may yield
different results when the acoustic properties of segments overlap temporally, or when
phonological units should not be assumed for some
research purpose. Ramus and Mehler’s method is more sophisticated than the LPC
method, but they are not pursuing the same thing. Ramus and Mehler’s approach is
more ‘phonological’, and the approach using the LPC residual is more ‘phonetic.’

The LPC residual can thus provide another experimental paradigm for perceptual 
LID research, or the perceptual study of prosodic typology. 


Special thanks go to the coauthors of the works cited in this paper, Takayuki Arai, Won 
Tokuma, Shinichi Tokuma, Takahito Shinya, Miyuki Takasawa, Kazuya Mori, and Yuji 
Murahara. I also thank John Hogan and Lois Stanford for reading the draft of this paper 


THE LAH PARTICLE IN SINGAPORE ENGLISH: 
A RELEVANCE-THEORETIC APPROACH 


Vivien Soon Lay Ler 
National University of Singapore 


Discourse particles in Singapore English, such as lah, lor, hah, and hor, which 
give Singapore Colloquial English (SCE) its special flavour, are among the most 
frequent words used in spontaneous dialogues. They fulfill many pragmatic functions 
with respect to a number of linguistic and interactional domains. The meanings and 
functions of these particles have been the subject of previous discussion (e.g., 
Kwan-Terry 1978, Platt 1987, Gupta 1992, Pakir 1992). 

In the present paper, I concentrate on the use of the most frequently occurring 
discourse particle in SCE, the particle lah. In the top-five list of particles in the spoken 
categories in ICE-SIN,¹ lah ranks first with 1,742 occurrences. The particle that ranks 
second is ah with 1,242 occurrences, followed by hah with 256 occurrences. 

People use lah to convey the mood and attitude of the speaker (OED 2000). For 
example, ‘Go to Chinatown lah’ is used as a suggestion. According to Brown (2000:127), 
lah is used with a request or command to indicate impatience (e.g., ‘Finish your food lah’) 
or to turn the utterance into a plea (e.g., ‘Give me more time lah’). Lah has many 
functions (cf. Section 1). Is there a single meaning of lah compatible with the varying 
uses of lah in actual discourse? That is, what kind of cognitive information does lah 
encode? The answers to these questions will unlock one of the most interesting 
mysteries in intercultural pragmatics. 

I argue that we need a unified meaning of lah to understand how it is interpreted 
in discourse. Such a meaning can be reached by studying the cognitive processes 
involved in interpreting utterances with lah, in terms of three fundamental issues: is 
the meaning of lah conceptual or procedural; is it truth-conditional or 
non-truth-conditional; and how does it constrain the utterance? The data in my study are from 
two main sources. The first is the recently completed ICE-SIN, which is by far the most 
comprehensive collection of Singapore English. The second is personal conversations 
or statements overheard at churches or canteens and recorded either at the time or 
immediately thereafter. 

1. previous accounts. Lah has been discussed by many writers, but there is 
considerable disagreement as to its use and functions. Previous descriptions of lah are based 
on a few examples; only Richards and Tay (1977) included a telephone conversation 
in their study, and they used only a fragment of the whole. 

The earliest account of lah is by Tongue: depending on the way it is pronounced, 
lah can function as an ‘intensifying particle, as a marker of informal style, as a signal 
of intimacy, for persuading, deriding, wheedling, rejecting and a host of other 
purposes’ (1974:114). Tongue’s account of lah is the first to treat it as characteristic of SCE. 

Lah is treated as a marker of rapport or solidarity (Tongue 1974; Richards & Tay 
1977; Kwan-Terry 1978; Bell & Ser 1983; Pakir 1992) and as a marker of emphasis 
(Richards & Tay 1977; Platt et al. 1983; Loke & Low 1988). It has also been said to 
communicate a range of attitudes, such as obviousness, persuasion and impatience. Other 
functions adduced include friendliness, hostility, and annoyance. It is also described 
as an indicator of enthusiasm and assertion or as a word communicating objection. 
This list is not exhaustive. The main functions, as described by past researchers, are 
shown in the following examples. (Unless indicated otherwise, the examples are from 
my personal collection or from ICE-SIN.) 

1.1. solidarity. Lah is first described as a marker of rapport or solidarity, comparable 
to the English ‘filler’ you know: 

(1) Don’t be shy lah. [We are friends] 

(2) No use trying to hide our roots lah. [We are Singaporeans] 

However, this classification cannot accommodate data such as (3), where no element 
of rapport or solidarity can be detected: 

(3) Context: A mother (A) and her daughter (B) have a disagreement on 
who is to buy Mandarin oranges. (It is customary for the Chinese to 
exchange Mandarin oranges when visiting during the Chinese New Year.) 

A: Then after that it’s the Lunar New Year special lah. 

B: So? 

A: Ya lah, then during that period we can go what? 

B: Cannot lah. Aiyah, when I wash my hair, I don’t want to go out. 
Dirty my hair lah. 

A: You bring one of them lah. (ICE-SIN-S1A-007) 

1.2. emphasis. According to some researchers, lah contributes an element of emphasis 
to sentences such as (4) and (5): 

(4) Do you want to go? I’m not going lah. [Emphasis] (Kwan-Terry 1992:69) 

(5) Normal doctors lah who are on our medical panel, [not specialists] 
(ICE-SIN-S1B-073) 

But emphasis does not explain (3). 

1.3. obviousness. Lah is often said to convey the speaker’s attitude of ‘obviousness’. 
There is also a note of impatience or annoyance in these cases: 





(6) They generally don’t take beef lah. [It’s obvious; everybody knows that.] 
(ICE-SIN-S1A-023) 

(7) I mean of course it changes lah. (ICE-SIN-S1A-065) 

Obviousness might explain (3), but it does not explain (1) and (2). 

1.4. persuasion. Lah can also be used with a certain tone to persuade or to suggest: 

(8) Come with us lah. [Won’t you?] (OED 2000) 

(9) Go to Chinatown lah. [Why don’t you?] (ICE-SIN-S1A-007) 

Again, persuasion does not explain (3). 

1.5. friendliness. Lah is sometimes used when the speaker wants to be friendly: 

(10) Okay, doesn’t matter lah. [It’s all right; we’re friends.] (ICE-SIN-S1A-091) 

(11) Quite nice lah. [I’m your friend; consider my opinion.] 
(ICE-SIN-S1A-023) 

Friendliness does not explain (3) either. 

1.6. hostility. Sometimes, lah is described as conveying a sense of ‘hostility’: 

(12) If you want then it should be after this week lah. [Not earlier!] 
(ICE-SIN-S1A-091) 

(13) I don’t want to eat lah. [Don’t force me!] 

The list of functions ascribed to lah in (1)-(13) is not exhaustive. But simply calling 
it polysemous or multifunctional is inadequate. This can be seen by comparing the 
same utterance with and without lah: 

(14) a. I mean of course it changes lah. (ICE-SIN-S1A-065) 
b. I mean of course it changes! 

The ‘obviousness’ in (14) is evident even without the particle. It is in the semantics of 
the utterances, as indicated by of course in (14). Hence, ‘obviousness’ cannot be 
characterised as an inherent part of lah. Similarly, ‘persuasion’ in (8) and (9), ‘friendliness’ 
in (10) and (11), and ‘hostility’ in (12) and (13) may have other sources, such as intonation 
or tone. The meanings in (8)-(14) are preserved even when lah is omitted. Thus 
these functions are not inherent in the particle itself. 

What is most striking about the above descriptions is that they tell us different 
things about the particle. Some of the descriptions contradict others (e.g., friendliness 
vs. hostility). The reason for the varied descriptions is that none of them gives a complete 
picture of lah. Thus, while such findings shed some light on what lah does, a general 
description does not yet exist. Specifically, how does lah operate in discourse? Nor 
have its contradictory functions in imperatives (a softening effect) and declaratives (a 
strengthening effect) been explained. In my account, we shall see why this is so. 

2. A RELEVANCE-THEORETIC ACCOUNT. 

2.1. the lah particle. Lah can be attached to declaratives, as in (15), imperatives, as in 
(16), and some interrogatives, e.g., (17). The range of lah is shown in (15)-(18). It was 
first shown by Bell and Ser (1983) that lah need not occur in sentence- or 
clause-final position, as shown in (18)a-c (Bell & Ser 1983) and in (18)d. 

(15) a. She’s quite playful lah. (ICE-SIN-S1A-091) 

b. No lah! This way cannot! Miss turn already. (The Straits Times 6 Apr 2001) 

(16) a. Come on lah. (ICE-SIN-S1A-065) 

b. Bring one of them lah. (ICE-SIN-S1A-007) 

(17) What’s in fashion lah? (ICE-SIN-S1A-003) 

(18) a. Must lah have been cooking. 

b. Must have been lah cooking. 

c. That great hawker lah from Newton Circus. 

d. Normal doctors lah which are on our medical panel. (ICE-SIN-S1B-073) 

However, there are constructions that do not allow lah. Consider (19)-(21): 

(19) *Are you going home lah? 

(20) *He’s asleep lah? 

(21) *Where are you going lah? [seeking information] 

Examples (19)-(21) illustrate the unacceptability of using lah in yes/no interrogatives 
(19), declarative interrogatives (20), and wh-interrogatives seeking factual 
information (21). These restrictions are not in the literature, and an explanation should be part 
of the full description of lah.² 

I suggest that lah is a procedural, non-truth-conditional particle that contributes to 
the explicature of an utterance. It encodes procedural information about the speaker’s 
desire for the hearer to recognise the shared assumption(s) behind the utterance. 
To explain this I turn to those ideas of relevance theory that are pertinent to my 
account, specifically to the notion of constraints on relevance (Sperber & Wilson 
1995; Blakemore 1987). 

2.2. theoretical framework. A recently developed pragmatic framework, 
Relevance Theory (Sperber & Wilson 1995), is a general theory of communication based 
on cognitive principles. It looks at utterances as inputs to inferential processes which 
affect the cognitive environment of the hearer. In this account of communication, 
interpretation of utterances is not merely a matter of linguistic decoding but relies 
heavily on inference. The process of utterance interpretation is governed by the 
principle of relevance, that is, ‘every act of ostensive communication communicates a 
presumption of its optimal relevance’ (Sperber & Wilson 1995:176). 

Relevance is seen as a combination of the effects gained and the cost of 
processing the utterance (the greater the contextual effects, the greater the relevance; the 
smaller the processing effort to arrive at the intended interpretation of an 
utterance, the greater the relevance). In utterance interpretation, the hearer begins by 
decoding the utterance linguistically, which often involves reference assignment, 
disambiguation and enrichment.³ If the hearer does not find the explicit context 
optimally relevant, he will search for other contextual assumptions (by calling to 
mind accessible premises) which will enable him to reach the intended conclusion. 
Consider (22): 

(22) The coach was cold. 

In (22), the sentence is ambiguous: ‘coach’ can refer to someone who coaches a sport 
or to a vehicle, and ‘cold’ can be a certain physical condition or a certain attitude. In 
the right context, (22) is no longer ambiguous. Consider (23)a-b: 

(23) a. We started basketball lessons last week. 

b. The game was fun but the coach was cold. 

Another important postulate of Relevance Theory is that expressions in language can 
be seen to encode not only concepts but also procedures. Such expressions guide 
the hearer in the process of utterance interpretation and contribute to relevance by 
reducing the processing effort needed to reach the intended interpretation 
(Blakemore 1987). Lah encodes procedural meaning. 

The processing effort is reduced by the effect of constraints on relevance (Sperber & 
Wilson 1995), i.e., by making the hearer’s context set smaller. The analysis of (24) and 
(25) provides a better understanding of constraints on relevance. 

(24) Benjamin Bratt likes to please Julia Roberts. 

(25) He loves Julia Roberts. 

(24) and (25) can be construed as being in a variety of relations. For example, they 
could be just two facts or beliefs, or one could be construed as giving evidence for 
the truth of the other. In that case, one is the conclusion, while the other supports it. 
Because either can be the conclusion, there is the possibility of misinterpretation. In 
such cases, Blakemore (1987) argues that constraints on relevance play a vital role. 
Consider (26) and (27): 

(26) a. Benjamin Bratt likes to please Julia Roberts, 
b. After all, he loves Julia Roberts. 

(27) a. Benjamin Bratt likes to please Julia Roberts, 
b. So, he loves Julia Roberts. 


Although the two utterances are in the same order in (26) and (27), they are related in 
different ways. In (26), (26)a is the conclusion with (26)b providing the evidence. In 
(27), (27)b is the conclusion with (27)a giving the evidence. In both cases the speaker 
expects the hearer to have further contextual assumptions available, and they are not 
the same for (26) as they are for (27). Thus, in (26), the speaker expects the hearer to 
access assumption (26'): 

(26') If X loves someone then X likes to please this person. 

In (27), however, the assumption is (27'): 

(27') If X likes to please someone then X loves this person. 

Thus after all and so constrain the processing of the two utterances in different ways. 

Blakemore argues, on the basis of examples such as (26) and (27), that words such 
as so and after all do not contribute to the truth-conditional content of the utterance 
in which they occur and that they do not encode conceptual meaning. Their role is to 
help ‘constrain the hearer’s choice of context for its interpretation’ (1987:141). 


3. analysis and discussion. It is now possible to interpret the particle lah using a 
Relevance Theory approach. Consider (28): 


(28) Context: A and B are discussing how the economic downturn has affected 
business and as a consequence organisations have to be prudent to 
protect the interests of shareholders. 

A: So you know we are not spared lah. Okay we are not spared lah. 

B: Uhm nice to know that I am not alone in all this. 

A: You are not spared okay. (iCE-siN-siB-077) 

(29) A’s contextual assumptions include 5 premises: 

Premise 1. The economic downturn has affected A’s business. 

Premise 2. A knows that other businesses have been also affected by the 
downturn. 

Premise 3. A wants B to know that his business has been affected by the 
downturn. 

Premise 4. A wants to assure B that he has his sympathy. 

Premise 5. A knows that he has to be prudent. 


A’s intention in the utterance is not only to inform B that they are not spared the 
consequences of the economic downturn but also to indicate the speaker’s desire 
for the hearer to recognise the shared assumption Premise 4. This is effected by lah. 
The shared assumption is recognised when B says that it is nice to know that he is 
not alone in all this. B feels reassured. Let us now look at a lah-appended utterance 
involving an interrogative: 

(30) Context: A and B are discussing the latest fashion. 

B: What is in thing? 

A: What’s in fashion lah? 

B: Couldn’t really identify any fashion. (ICE-SIN-S1A-003) 

(31) A’s contextual assumptions include 5 premises: 

Premise 1: B has not given A the right response. 

Premise 2: A is not happy that B does not know what is the ‘in thing’. 

Premise 3: A feels that B should know what she means. 

Premise 4: A wants B to know that she disapproves of B’s ignorance. 

Premise 5: A feels that she does not have to spell out what ‘in thing’ is. 

Lah leads the hearer to access and consider assumptions implicitly communicated 
by an utterance, assumptions that may or may not be retrieved in the absence of the 
particle. The assumption made accessible by A’s comment in (30) is something like 
Premise 4 in (31). The hearer would choose the first interpretation coherent with the 
principle of relevance. B knows that A is reproachful of her (B’s) ignorance and is 
showing it indirectly. In attaching lah to her utterance, A shows that she expects more 
from B than just an answer to her question. B’s reply ‘Couldn’t really identify any 
fashion’ appears to answer the question as part of the ‘repair’ to her recognition of 
A’s reproach. B’s reply would most probably be accompanied by non-verbal linguistic 
expressions such as a soft tone, an apologetic look, thus indicating that she knows 
what A meant. Lah helps convey the speaker’s desire for the hearer to recognise the 
shared assumption made manifest in the context (Premise 4, which is similar to what 
B has in her cognitive environment). In other words, the speaker desires that her 
(informative) intention to make manifest the shared/common assumption be fully 
recognised by the hearer. 

Our analysis that lah signals the speaker’s desire that the hearer recognise the 
shared assumption behind the utterance can explain all the communicative effects 
ascribed to lah in the literature (cf. Section 1). It accounts for the solidarity and 
rapport felt between communicators. If I make known to you that there are shared 
assumptions between us, I am treating you as someone I can relate to, as a member of 
a certain community which is also mine. In so doing, I create an impression of 
rapport between us. 

The present account also explains why (8), ‘Come with us lah’, is seen as more 
polite than ‘Come with us’. That is, lah makes imperatives more polite (‘weakening’ 
imperatives). In (8), lah invites the hearer to recognise the speaker’s desire to have the 
same shared assumption behind the utterance. To explain this, it is necessary to look 
at what mood a sentence encodes, and how this affects the way relevance is achieved. 
According to Sperber and Wilson (1988), the indicative mood shows that the thought 
communicated by the utterance is entertained as a true description of an actual state 
of affairs. The imperative mood, on the other hand, indicates that the thought 
communicated is entertained as a true description of a potential and desirable state of 
affairs. In a command or request, lah has the effect of asking the hearer to find and 
accept how relevant it is for the speaker to achieve some potential and desirable state 
of affairs. In a suggestion or persuasion, lah has the effect of asking the hearer to find 
and accept how relevant it is for the hearer to achieve some potential and desirable 
state of affairs. 

The present account also explains why lah added to an imperative can appear to be 
persuading or pleading. In (8), lah instructs the hearer about the speaker’s desire to have 
the hearer recognise the shared assumption behind the utterance. In a context where the 
hearer appears unable to recognise the speaker’s intention to draw on a shared 
assumption (e.g., it will be good for the hearer to go along with them), lah can be interpreted as 
an attempt to persuade the hearer to accept the speaker’s point of view. 

That lah appended to utterances adds an element of annoyance or impatience can 
also be accounted for in our explanation of the particle. If the hearer appears not to 
recognise the shared assumption as desired by the speaker, then the speaker’s 
insistence that he do so may have a touch of annoyance or impatience. This does not mean 
that lah contains annoyance or impatience in its semantic make-up. 

The various functions of lah as described by previous researchers, such as 
obviousness, friendliness, and consultativeness, can be subsumed under our description 
of the particle. Lah encodes the speaker’s desire that the hearer recognise a shared 
assumption behind the utterance, which in turn functions as an explicit guarantee 
of relevance. As a consequence of such an explicit guarantee of relevance, the hearer 
is encouraged to expand the contextual assumptions in order to obtain the intended 
contextual effects. 

I have shown so far that lah is used when the speaker intends to draw on the shared 
assumptions behind the utterance. There are two ways in which linguistic meaning 
contributes to the interpretation of utterances. It may encode conceptual meaning,⁴ 
on the one hand, or it may contain procedural information, that is, instructions for 
processing propositions (Blakemore 1996:151). In the case of lah, the particle appears 
to belong to the group of linguistic entities that encode procedural meaning. In my 
analysis, lah does so by signalling the speaker’s intention that the shared assumption 
be fully recognised by the hearer. In turn, this functions as a guarantee of relevance. 

The distinction between truth-conditional and non-truth-conditional meaning is 
not the same as the distinction between conceptual and procedural meaning. For 
example, sentence adverbials such as ‘frankly’ and ‘seriously’ are treated as 
non-truth-conditional, but they encode conceptual meaning (Wilson and Sperber 1993; Blakemore 
1992). In the case of lah in SCE, the particle can be omitted without affecting the truth 
conditions of the host utterance. Consider (6), reproduced here as (32)a: 

(32) a. They generally don’t take beef lah. (ICE-SIN-S1A-023) 
b. They generally don’t take beef. 

For both (32)a and (32)b, the proposition is the same, namely, that there is a group 
of people who generally do not eat beef. If the particle is omitted, as in (32)b, there is 
no loss in propositional meaning. Lah instructs the hearer about the speaker’s desire 
for the hearer to recognise the shared assumption behind the utterance, that is, what 
the speaker wants to say is not ‘They generally don’t take beef’ but that the speaker 
need not spell this out; the hearer should be able to gather that from the context. In 
this way, the presence of the particle allows the hearer to process the utterance in the 
smallest context, thereby making for optimal relevance. 

4. conclusion. The present account of lah has several advantages. First, it explains 
the difference between an utterance with lah and one without lah, which has not been 
adequately dealt with in the previous studies. A relevance-theoretic account of lah 
explains the difference in the following manner. By providing an overt guarantee of 
relevance, lah can guide the hearer to explore assumptions implicitly communicated 
by an utterance with lah, including contextual assumptions intended by the speaker. 
An utterance without lah does not have this encouragement. 

Next, I argue that lah is procedurally used by the speaker and the various uses of 
this particle can justifiably be subsumed under a single description of a code marker 
instructing the hearer to recognise the shared assumption behind the utterance. With 
this unified meaning, glosses of the type presented by the previous studies become 
redundant. 


¹ The lexical corpus of ICE-SIN (International Corpus of English, Singapore component) 
was completed at the Department of English Language and Literature, the National 
University of Singapore, in April 2000. Its completion was achieved thanks to the 
NUS-funded project A Study of Definite Noun Phrases in Singaporean and British Discourse 
(RP3982058). It is a one-million-word corpus, consisting of 500 texts (200 written and 300 
spoken) of approximately 2000 words each. The data used in this study are taken mainly 
from the spoken texts. 

² Space considerations preclude a discussion of the restrictions on lah. 

³ Semantic representations often have to be enriched. For example, in ‘The bat is too grey’, 
the adverb too is semantically incomplete. The bat is too grey for something. (For further 
details, please refer to Sperber & Wilson 1995:188-89.) 

⁴ The distinction between conceptual and procedural meaning is a useful one, as it expresses 
the intuition that there are different types of linguistically encoded information. It is also 
useful in explaining expressions that do not affect the propositional content of an 
utterance. Due to constraints of space in this paper, a detailed account is not possible. I refer 
the reader to the literature (Blakemore 1987, 1992; Wilson & Sperber 1993). 


WHAT LANGUAGE STABILITY TELLS US ABOUT 
LANGUAGE CHANGE, AND VICE VERSA 


Mary S. MacKeracher 
University of Toronto 


Language change has been studied by many linguists, but language stability, or 
the lack of change, has received little attention. In this paper, I look at what S-curve 
change, the most common form of language change, predicts about language stability: 
stability at near-categorical levels is the only possible result of language innovation. I 
proceed by first presuming that the internal-influence logistic model can adequately 
represent S-curve change, and then I expose some of its inherent assumptions. Next, 
using data from four projects of the Dialect Topography of Canadian English, I 
illustrate that this model is appropriate for many of the variables investigated. 

However, some of the language variation patterns do not conform to this model. 
Stable variation, or stability at middling frequencies of use, is one such pattern. But 
when I slightly alter the initial mathematical model, this uncommon but not rare 
phenomenon is accommodated. The new model now predicts that language change will 
end in stability at either near-categorical or middling frequencies. 

This adaptation inspires us to ask if the model with additional modifications can 
be made to accommodate other less common change patterns. I show that with small 
alterations we can accommodate language reversal, that is, when an innovation fails 
to achieve lasting prevalence. 

We should note that language change is only one type of human innovation. 
Everett M. Rogers has created a theoretical paradigm on the basis of over four 
thousand works from many disciplines on human innovation. Rogers (1995:10) defines 
innovation diffusion as ‘the process by which an innovation is communicated through 
certain channels over time among the members of a social system’. The four elements 
are the innovation, communication channels, time, and the social system. I will 
discuss language change using this paradigm. 

1. s-curve language change: the standard model. The most common pattern 
of language change is S-shaped (Bailey 1973:77, Inoue 1997:79, Labov 1994:65). The 
S-curve diffusion pattern is illustrated by Figure 1 (overleaf). 

An innovation begins slowly, with only a small percentage of the society using the 
new variant. When the innovation has been adopted by 10-20% of the population, it 
reaches a ‘critical mass’, the number of people needed for diffusion to take off, or more 
precisely, the point at which the innovation’s rate of adoption becomes self-sustaining. It 
then proceeds more quickly, as the majority of people adopt the innovation. And finally 
the change slows when almost all use the variant (Rogers 1995:261-63). 

[Horizontal axis: increasing time] 

Figure 1. The S-curve diffusion pattern. 


This pattern can be represented by many mathematical models, the two most 
commonly used being the logistic and cumulative normal statistical distributions 
(Griliches 1957:503, Labov 1994:65, Mahajan & Peterson 1985:10). For ease of manipulation, 
I work with the logistic distribution. The simplest version of this distribution that I 
feel reflects language change is the internal-influence model and it will serve as our 
starting point. 

2. the internal-influence logistic model. The fundamental premise of the 
internal-influence logistic model is that the innovation propagates throughout the chosen 
society or population through ‘horizontal’ channels, that is, solely through personal 
contact between members of the society (Mahajan & Peterson 1985:17). The 
mechanism of change is the social interaction between prior adopters of the innovation and 
current-non-adopters-but-potential-adopters. Figure 2 gives the differential version 
of the internal-influence logistic model (adapted from ibid.:17-21). 

The first line is the actual equation. Below it are two notes; the first gives the 
boundary condition for the differential version if it is to be integrated and the second 
indicates what we solve for. Following the notes is a description of the variables, 
constants, parameters (actually just one parameter here), and some of the composite 
expressions in the equation. 

We express the number of people who adopt the language innovation as a 
function of time: the range variable is N, the number of people who adopt the language 
innovation, and the domain variable is t, time. In the differential version, the 
incremental number of people who will adopt the innovation at time t forms the left side 
of the equation. It is equal to the probability of adopting the innovation (the first term of 
the right side) multiplied by the number of potential adopters who have yet to adopt 
the innovation (the second term of the right side). We solve not for the variables t and 
N, which are known from survey data, but for the parameter b, the rate of adoption of 
the language innovation. 

$$\frac{dN(t)}{dt} = \mathbf{b}\,N(t) \times [\bar{N} - N(t)]$$

1. with boundary condition: at $t = t_0$ (start of the diffusion process), $N(t_0) = N_0$. 

2. where parameter b, indicated in bold, is to be determined. 

$N(t)$ is the cumulative number of adopters of the innovation at time t 

$\frac{dN(t)}{dt}$ is the rate of diffusion of the innovation at time t 

$b$ is the rate of adoption of the innovation 

$bN(t)$ is the coefficient of diffusion for the internal-influence model; it is the second term of 
the general coefficient of diffusion, $g(t) = a + bN(t) + cN(t)^2 + \dots$ 
(consider the coefficient of diffusion as the probability of adoption of the 
innovation at time t) 

$\bar{N}$ is the total number of potential adopters in the social system 

$[\bar{N} - N(t)]$ is the number of potential adopters who have not yet adopted the innovation 
at time t (it is the difference between the total number of potential adopters, $\bar{N}$, and the 
number of previous adopters, $N(t)$, at time t) 

Figure 2. Differential version of internal-influence model (adapted from Mahajan & 
Peterson 1985:17-21). 

Note that the integral version (not given in this paper) is what actually produces 
the S-curve of Figure 1. Because its meaning is not particularly transparent, I use its 
corresponding differential version here. 
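
Although the integral version is not reproduced in the paper, the differential equation in Figure 2 has a standard closed-form solution, and a short numerical sketch may make both the S-curve and the role of b more concrete. Writing n_total for the total number of potential adopters and n0 for the number of adopters at the start time t0, the solution is N(t) = n_total / (1 + ((n_total - n0) / n0) * exp(-b * n_total * (t - t0))). Everything else below is illustrative: the function names, the population of 1,000, the 10 initial adopters, and the value of b are assumptions chosen for the example, not values from the Dialect Topography data.

```python
import math


def adopters(t, b, n_total, n0, t0=0.0):
    """Closed-form solution of dN/dt = b * N(t) * (n_total - N(t)), N(t0) = n0."""
    ratio = (n_total - n0) / n0
    return n_total / (1.0 + ratio * math.exp(-b * n_total * (t - t0)))


def estimate_b(times, counts, n_total):
    """Recover b from (time, cumulative adopters) observations.

    ln(N / (n_total - N)) is linear in t with slope b * n_total, so the
    least-squares slope divided by n_total gives an estimate of b.
    """
    logits = [math.log(n / (n_total - n)) for n in counts]
    mean_t = sum(times) / len(times)
    mean_y = sum(logits) / len(logits)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logits))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return (sxy / sxx) / n_total


# Illustrative values only (hypothetical population and rate).
N_TOTAL, N0, B = 1000.0, 10.0, 0.0005
times = [float(t) for t in range(0, 21, 2)]
counts = [adopters(t, B, N_TOTAL, N0) for t in times]

for t, n in zip(times, counts):
    print(f"t={t:4.0f}  adopters={n:7.1f}")
print("recovered b:", estimate_b(times, counts, N_TOTAL))
```

The printed counts rise slowly, then quickly, then level off near the total, which is the S-shape of Figure 1. The `estimate_b` helper mirrors the statement that t and N are known and b is what is solved for; with real survey data the fit would be approximate rather than exact.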

3. THEORETICAL PREDICTIONS AND ASSUMPTIONS OF THE MODEL. It is my impression
that those who discuss the S-curve model presume an innovation is in either a state
of stability or a state of change (non-stability). Technically speaking, this is a notional 
or qualitative reading of the varying rates of change along the S-curve, but because 
this idea seems to be generally accepted, I shall presume that the model may be inter¬ 
preted in this way. 

Therefore, when we examine Figures 1 and 2, we see that S-curve change as rep¬ 
resented by the internal-influence logistic model makes two predictions about stabil¬ 
ity. First, stability occurs only before or after change, never as an intermediate stage 
of change. And second, stability occurs only at near-categorical levels of frequency of 
the population (close to the 0 or 100% asymptotes of N).
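
The two predictions can be read directly off the right side of the equation in Figure 2. The sketch below, with a hypothetical population of 100 and a hypothetical value of b, evaluates the rate of diffusion at several adoption levels: it is close to zero near either asymptote and largest at the midpoint.

```python
def diffusion_rate(adopters, b, total):
    """Right-hand side of Figure 2: dN/dt = b * N(t) * (total - N(t))."""
    return b * adopters * (total - adopters)

TOTAL = 100.0   # hypothetical population, so N can be read as a percentage
B = 0.001       # hypothetical rate of adoption
rates = {n: diffusion_rate(n, B, TOTAL) for n in (1, 10, 50, 90, 99)}
fastest = max(rates, key=rates.get)   # adoption level with the highest rate
```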

In addition to these predictions concerning stability, the model makes many 
assumptions (Mahajan & Peterson 1985 passim; Rogers 1995:260), three of which I 
list here. First, as mentioned above, change or the rate of adoption of the innovation 
is solely a function of time. Second, also discussed above, the innovation propagates 
through internal, personal contact (internal influences), and not through any influ¬ 
ence from change agents outside of the population (external influences). And third, 
Figure 3. Canadian English Dialect Topography projects.

the innovation is ‘successful’, that is, all of the population adopts it and nothing sup¬ 
plants it (Rogers 1995:260). 

Looking at the predictions and assumptions together, we may question whether 
the S-curve model and our mathematical representation are in any way robust. Are 
they good models, that is, do they adequately reflect most if not all of the patterns we 
encounter in language variation? And we can pose more specific questions. First, how 
are sociolinguistic factors incorporated into the model? We sociolinguists would prefer 
a model that reflects the influence of these potentially explanatory factors to a model 
that merely says at a given time A, the number of adopters will be B. Second, what is the 
status of stable variation, that is, stability at middling frequencies? Is it a natural excep¬ 
tion to the model, or a case which ought to be modeled too? Third, what exactly is an 
unsuccessful innovation and can it be modeled? And fourth, what are external influ¬ 
ences and can they be incorporated into the model as well as internal influences? 

4. data and methodology. The data I use here are from the Dialect Topography of 
Canadian English (Chambers 1994 passim, inter alia). It is a language survey that has 
been completed in four regions to date, namely the Golden Horseshoe, the Ottawa 
Valley, Montreal, and Quebec City (see Figure 3). The instrument is a self-report 
postal questionnaire. It consists of 11 personal questions (age, sex, education,
places where the respondent is currently living and was living when aged 8 to
18, birthplaces of the respondent and parents, and occupations of the respondent and
parents); 4 English Language Use questions, for all projects except the Golden Horseshoe;
and 81 linguistic questions.

I have analyzed 45 linguistic variables for the four projects. In particular, I have 
investigated the behavior of the ‘dependent’ linguistic variable with respect to the 
‘independent’ or potentially explanatory variables of region/central place, age, sex, 
social class, education, and three indices devised by J. K. Chambers: Regionality Index, 

Figure 4. Age chart for (peach) - pit. [Chart: horizontal axis, age of respondent (over 80,
70-79, 60-69, 50-59, 40-49, 30-39, 20-29, 14-19); one line each for pit in Golden Horseshoe
(U.S.), Golden Horseshoe (Can.), Ottawa Valley, Quebec City, and Montreal.]

Occupational Mobility Index, and Language Use Index. In this paper, I discuss 4 vari¬ 
ables: (peach), (sick), (either), and (mom). And I will concentrate primarily on the 
behavior of the linguistic variable with respect to age. 

There are two major assumptions that I make here. First, I presuppose the critical 
age hypothesis, that an individual does not alter his or her speech after a ‘critical age’, 
say around puberty. I make this assumption in order to presuppose the apparent time 
hypothesis, that age can stand for time. I must make this latter assumption if I wish 
to use a cross-sectional instrument to study diffusion, an intrinsically longitudinal 
problem. The second assumption is that the standard or expected temporal pattern of 
language change is modeled by the internal-influence logistic model. 
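
In practice, the apparent time hypothesis amounts to treating the ordered age groups of the charts as successive points on a time axis, oldest first. A minimal sketch of that mapping follows; the function name is hypothetical, and the equal spacing of the steps is a simplifying assumption, since the oldest and youngest bands do not span ten years.

```python
# Age bands as they appear on the age charts, oldest first.
AGE_BANDS = ["over 80", "70-79", "60-69", "50-59",
             "40-49", "30-39", "20-29", "14-19"]

def apparent_time_index(age_bands):
    """Map each age band to an apparent-time step (0 = earliest time)."""
    return {band: step for step, band in enumerate(age_bands)}

steps = apparent_time_index(AGE_BANDS)
```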


5. VARIABLES THAT EXHIBIT THE STANDARD S-CURVE PATTERN. 

5.1. peach. The variable (peach), investigated in an open-ended question, is bivariate: in 
all areas, 85% of the respondents answered either pit or stone to the following question: 

In the middle of a peach you always find a _. 

As shown in Figure 4, the age chart for the innovative variant pit, its adoption pattern 
across apparent time (age) is a typical, complete S-curve. The adoption of pit begins 
slowly in the earliest times (oldest age groups), then speeds up in more recent times 
(middle-aged groups), and then slows in the most recent times (youngest age groups). 
The change ends in stability at near-categorical levels of adoption of this variant. 

5.2. sick. The variable (sick) concerns how the adjective sick is modified by the noun 
stomach. Results from the following closed-ended question reveal that the variable is 
not particularly variable: it is almost univariate. 

Figure 5. Age chart for (sick) - sick to my stomach. [Chart: horizontal axis, age of
respondent (over 80, 70-79, 60-69, 50-59, 40-49, 30-39, 20-29, 14-19); one line each for
sick to in Golden Horseshoe (U.S.), Golden Horseshoe (Can.), Ottawa Valley, Quebec City,
and Montreal.]


Which do you say?
□ Yesterday I was sick at my stomach.
□ Yesterday I was sick in my stomach.
□ Yesterday I was sick to my stomach.
□ Yesterday I was stomach-sick.


In all areas, the frequency of sick to my stomach is at least in the low 80% range.
Figure 5, the age chart for the innovative variant sick to my stomach, shows that
the variable is first changing or non-stable, then becomes stable in all regions. We
see a typical S-curve in its last phase. The change ends in a long period of stability at 
near-categorical levels. 

Now let us begin to answer the questions posed in section 3. First, how are sociolin- 
guistic factors incorporated into the internal-influence logistic model of Figure 2? They 
are not incorporated directly into the model, but instead, indirectly: they affect the value 
of parameter b. If social factors conspire to promote the language innovation aggres¬ 
sively, the value of parameter b will be relatively large, but if social factors promote the 
innovation less aggressively, the value of parameter b will be relatively small. Simply put, 
the more rapid the change, the larger the value of parameter b. We should note that the
value of parameter b is the only avenue through which social factors, the potentially explanatory
variables, are incorporated into the model. We may ask whether we can do better:
can we incorporate the social factors more directly into the model, and if so, how? Let us see what
happens when we try to alter the mathematical model to accommodate more language 
change patterns. 
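
The link between parameter b and the speed of change can be made concrete. Under the model of Figure 2, the log-odds of adoption rise linearly at rate b times N, so the time needed to go from 10% to 90% adoption is ln(81) divided by that rate. This is a standard property of the logistic curve rather than a result of this paper, and the values below are hypothetical.

```python
import math

def time_10_to_90(b, total):
    """Time for adoption to rise from 10% to 90% of `total` under Figure 2.

    The log-odds of adoption grow linearly at rate b * total, and the
    log-odds at 90% and at 10% differ by ln(9) - ln(1/9) = ln(81).
    """
    return math.log(81) / (b * total)

TOTAL = 1000.0                         # hypothetical number of potential adopters
slow = time_10_to_90(0.00005, TOTAL)   # weakly promoted innovation
fast = time_10_to_90(0.00020, TOTAL)   # strongly promoted innovation
```

A fourfold increase in b shortens the transition by a factor of four, which is the sense in which aggressive promotion yields a larger b.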


6. TWO VARIABLES THAT DO NOT EXHIBIT THE STANDARD S-CURVE PATTERN. 

6.1. stable variation: either. The well-known bivariate variable (either) is investi¬ 
gated in the following closed-ended question: 

Is the ei of EITHER pronounced like the □ ie of pie, or the □ ee of bee? 

Figure 6. Age chart for (either) - [ij]ther-bee ([ij]ther, which rhymes with the word bee).
[Chart: horizontal axis, age of respondent (over 80, 70-79, 60-69, 50-59, 40-49, 30-39,
20-29, 14-19); one line each for [ij]ther-bee in Golden Horseshoe (U.S.), Golden Horseshoe
(Can.), Ottawa Valley, Quebec City, and Montreal.]


In all areas, there is an approximate 2-to-1 split for the variants ‘[ij]ther-bee’ and
‘[aj]ther-pie’ respectively. For the innovative variant of (either), ‘[ij]ther-bee’, we find
that, in the regions of the Canadian Golden Horseshoe, the Ottawa Valley, and Mon¬ 
treal, it is first changing or non-stable and then becomes stable. However as shown in 
Figure 6, the change ends in stable variation, stability at middling frequencies of use, 
and not in near-categorical adoption of the innovation as was seen for the variable 
(sick) in Figure 5. 

Stable variation is not a common phenomenon, but it is not rare, either. Can stable 
variation be incorporated into the model? The answer is yes, if we change two constants
($t_0$ and $N$) into parameters; in fact, both stable variation and staggered start
times of different innovations are achieved with these alterations. As shown in Figure 
7 (overleaf), this solution (based on Griliches 1957 passim) has three parameters, s, 
b, and n, which correspond respectively to the ‘origin’ at the start of the process, the 
‘slope’ or rate of change during the process, and the ‘ceiling’ at the end of the process. 
Most importantly, they represent three ways in which social factors are incorporated 
into the model: the social factors which begin the innovation, those which develop it, 
and those which are at the end of the change. Please note that the three parameters 
cannot be compared with each other: they are measured differently, in time units, 
number of adopters per unit time, and number of adopters. 

There are important consequences of this solution: it predicts that stable variation 
is a result of S-curve language change and is not a temporary plateau in the midst of 
a change. When the initial social factors promoting the adoption do not alter during 
the change, or alter to ones that continue to promote the innovation, the adoption 
of the innovation increasingly develops until it reaches near-categorical levels. But 
when the social factors alter in such a way that they no longer promote the innovation 
during the change process, stable variation results. By considering stable variation as 
a form of stability, we needed to alter our initial language change model. Can we alter 
the model to incorporate other types of language changes? 

$$\frac{dN(t)}{dt} = \mathbf{b}\,N(t) \times [\mathbf{n} - N(t)]$$

1. with boundary condition: at $t = \mathbf{s}$ (start of the diffusion process), $N(\mathbf{s}) = N_s$.

2. where parameters $\mathbf{b}$, $\mathbf{n}$ and $\mathbf{s}$, indicated in bold, are to be determined.

$N(t)$ is the cumulative number of adopters of the innovation at time $t$

$\frac{dN(t)}{dt}$ is the rate of diffusion of the innovation at time $t$

$b$ is the rate of adoption

$bN(t)$ is the coefficient of diffusion for the internal-influence model (consider it as the
probability of adoption of the innovation at time $t$)

$n$ is the ceiling or long-term equilibrium number of potential adopters in the social system

$[n - N(t)]$ is the number of potential adopters who have not yet adopted the innovation at
time $t$ (it is the difference between the ceiling number of potential adopters, $n$, and the
number of previous adopters, $N(t)$, at time $t$)


Figure 7. Differential version of internal-influence model with origin and ceiling
parameters (based on Griliches 1957 passim).
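
To see how the ceiling parameter produces stable variation, the following sketch evaluates the closed-form solution of the Figure 7 equation for two hypothetical innovations. One has a ceiling of 100% and an early origin, the other a ceiling of 67% and a later origin. All values are illustrative assumptions, and returning zero before the origin is my simplification rather than part of the model.

```python
import math

def logistic_with_ceiling(t, b, ceiling, start, n_start):
    """Closed-form solution of dN/dt = b*N*(ceiling - N) with N(start) = n_start.

    Before `start` the innovation has not begun, so the function returns 0.
    """
    if t < start:
        return 0.0
    ratio = (ceiling - n_start) / n_start
    return ceiling / (1.0 + ratio * math.exp(-b * ceiling * (t - start)))

# Hypothetical parameters, in percent of the population.
near_categorical = logistic_with_ceiling(200, b=0.002, ceiling=100.0, start=0, n_start=2.0)
stable_variation = logistic_with_ceiling(200, b=0.002, ceiling=67.0, start=20, n_start=2.0)
not_yet_started = logistic_with_ceiling(10, b=0.002, ceiling=67.0, start=20, n_start=2.0)
```

Both curves are S-shaped and both end in stability; they differ only in where they level off, which is how the model treats stable variation as an outcome of change rather than a pause within it.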




Figure 8. Age chart for (mom) - [mʌm]-tum ([mʌm], which rhymes with the word tum).
[Chart: horizontal axis, age of respondent (over 80, 70-79, 60-69, 50-59, 40-49, 30-39,
20-29, 14-19); one line each for [mʌm]-tum in Golden Horseshoe (U.S.), Golden Horseshoe
(Can.), Ottawa Valley, Quebec City, and Montreal.]


6.2. the unsuccessful innovation: mom. The variable (mom), investigated by a
closed-ended question, is bivariate with variants ‘[mʌm]-tum’ and ‘[mam]-Tom’:


Does MOM, as in ‘My Mom’s gone fishing with my Dad’, rhyme with
□ tum or □ Tom?


As shown in Figure 8, the variable is changing in all regions. The change pattern is a
reversal, where the innovation ‘[mʌm]-tum’ first increases in frequency but is unsuccessful
in achieving lasting prevalence at middling levels or higher.

I have not found any satisfactory mathematical model that logically suits the 
ascent and descent of an unsuccessful innovation. However, if we consider a rever- 

$$\frac{dN_1(t)}{dt} = [\mathbf{a_1} + \mathbf{b_1}N_1(t) - \beta_1 N_2(t)] \times [\mathbf{n_1} - N_1(t)]$$

$$\frac{dN_2(t)}{dt} = [\mathbf{a_2} + \mathbf{b_2}N_2(t) - \alpha_2 N_1(t)] \times [\mathbf{n_2} - N_2(t)]$$

$i = 1, 2$ for variants/innovations 1, 2

1. with boundary conditions: at $t = s_i$ (start of the diffusion process of the innovation $i$),
$N_i(s_i) = N_{is}$.

2. where parameters $a_i$, $b_i$, $n_i$, $s_i$, $\alpha_2$, and $\beta_1$, indicated in bold or in Greek letters, are to be
determined.

$N_i(t)$ is the cumulative number of adopters of the innovation $i$ at time $t$

$\frac{dN_i(t)}{dt}$ is the rate of diffusion of the innovation $i$ at time $t$

$a_i$ is the coefficient of diffusion for the external influences of the innovation $i$

$b_i$ is the rate of adoption of the innovation $i$

$b_i N_i(t)$ is the coefficient of diffusion for the internal influences of the innovation $i$ (consider
it as the probability of adoption of the innovation $i$ at time $t$ if innovation $i$ could not be
substituted by the other innovation)

$[a_i + b_i N_i(t)]$ is the coefficient of diffusion for the mixed-influence model if innovation $i$
could not be substituted by the other innovation

$-\alpha_2 N_1(t)$: consider as the probability of non-adoption of innovation 2 at time $t$ if innovation
2 can be substituted by innovation 1

$-\beta_1 N_2(t)$: consider as the probability of non-adoption of innovation 1 at time $t$ if innovation
1 can be substituted by innovation 2

$n_i$ is the ceiling or long-term equilibrium number of potential adopters in the social system
for innovation $i$

$[n_i - N_i(t)]$ is the number of potential adopters who have not yet adopted innovation $i$ at
time $t$ (it is the difference between the ceiling number of potential adopters, $n_i$, and the
number of previous adopters, $N_i(t)$, of innovation $i$ at time $t$)


Figure 9. Differential version of two-innovation substitute mixed-influence model
(adapted from Peterson & Mahajan 1978:213).


sal as a case of two successive innovations where the first was interrupted, then we 
can use a two-innovation ‘substitute’ model with start and ceiling parameters, as 
exhibited by Figure 9 (adapted from Peterson & Mahajan 1978:213). Each innovation
is modeled by its own equation. In ‘substitute’ models, the effects of competing
innovations are subtracted: hence the negative term in each equation ($-\beta_1 N_2(t)$ and
$-\alpha_2 N_1(t)$) based on the other variant’s frequency.
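
A small numerical sketch may help show how the two coupled equations of Figure 9 can yield a rise followed by a fall. The code below integrates them with simple Euler steps. Every parameter value is hypothetical, the second innovation is given a later start time, and the clamping of values at zero is an added assumption, since the equations themselves do not prevent a negative number of adopters.

```python
def simulate_substitution(steps=400, dt=0.1):
    """Euler integration of the two coupled equations in Figure 9.

    All quantities are proportions of the population. Values are clamped
    at zero, which is an extra assumption not stated in the model.
    """
    # Hypothetical parameters, for illustration only.
    a1, b1, beta1, n1 = 0.0, 0.4, 0.8, 1.0     # innovation 1 (interrupted)
    a2, b2, alpha2, n2 = 0.05, 0.6, 0.0, 1.0   # innovation 2 (substitute)
    s2 = 10.0                                   # start time of innovation 2
    N1, N2 = 0.05, 0.0
    history = []
    for step in range(steps + 1):
        t = step * dt
        history.append((t, N1, N2))
        dN1 = (a1 + b1 * N1 - beta1 * N2) * (n1 - N1)
        dN2 = (a2 + b2 * N2 - alpha2 * N1) * (n2 - N2) if t >= s2 else 0.0
        N1 = max(0.0, N1 + dt * dN1)
        N2 = max(0.0, N2 + dt * dN2)
    return history

history = simulate_substitution()
peak_time, peak_value, _ = max(history, key=lambda row: row[1])
final_value = history[-1][1]
```

With these values the first innovation climbs along an ordinary S-curve until the second one starts, then peaks and declines as the subtracted term outweighs its own internal influence.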

Why was the variant ‘[mʌm]-tum’ unsuccessful? Extensive statistical analysis with
the eight potentially explanatory variables gave no pan-regional correlation. I hypoth¬ 
esize that an influence external to the given society, Canadian English speakers, was 
at work, namely American English. In general, external influences, complements 
to internal influences, are change agents outside the given social system. They are
so-called vertical channels of communication, often structured, hierarchical, or 
formal, such as government and its agencies or mass media. To include the potential 
effects of external factors, a mixed-influence S-curve model is created by adding the 
parameter a to the internal-influence model. Parameter a is not related to the number 
of past adopters of the language innovation. It is assumed that the effects of the exter¬ 
nal and internal influences are simply additive. 
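
One consequence of the additive parameter a is that adoption can begin even when there are no previous adopters, which the internal-influence model alone cannot do. The sketch below integrates dN/dt = (a + bN(t)) × (n - N(t)) with Euler steps from a starting point of zero adopters; the function name and all values are hypothetical.

```python
def simulate_mixed_influence(a, b, ceiling, n_start=0.0, steps=500, dt=0.1):
    """Euler integration of dN/dt = (a + b*N) * (ceiling - N)."""
    n = n_start
    for _ in range(steps):
        n += dt * (a + b * n) * (ceiling - n)
    return n

# Hypothetical parameters, as proportions of the population.
internal_only = simulate_mixed_influence(a=0.0, b=0.5, ceiling=1.0)
with_external = simulate_mixed_influence(a=0.02, b=0.5, ceiling=1.0)
```

With a set to zero the process never leaves zero, whereas a small positive a is enough to start the change, after which internal influence carries it toward the ceiling.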

7. conclusions. The internal-influence logistic distribution is a descriptive model of 
S-curve language change, and it comes with many assumptions and predictions, only 
some of which were discussed here. It describes the number of people who adopt 
a language innovation as solely a function of time, and incorporates the effects of 
potential social factors as a parameter to be determined. When we make small altera¬ 
tions, the mathematical model no longer captures just S-curve change: we can accom¬ 
modate some language change patterns hitherto considered exceptions, for example, 
cases involving stable variation, reversal, and external influences. MacKeracher (2001) 
discusses more instances such as trivariant variables and differing sub-populations. 
the difficulties and limitations of employing speech errors as linguistic evidence
have been discussed in the literature (e.g., Cutler 1982). Researchers have
pointed out the problems associated with the use of speech errors as data. If similar 
distributional patterns are attested in different collections, then the corpus itself is 
criticized on the grounds that people (error collectors) tend to ‘recognize’ certain 
error types more often than others. On the other hand, if different patterns show up, 
then it is claimed that the data are not reliable because there is too much idiosyncrasy
involved. Although error data are open to criticism, they are nevertheless a fruitful
source of information about language.

In this study, the speech errors of Japanese will be compared with those of Eng¬ 
lish. The similarities and differences observed for the two languages indicate that 
speech errors can reveal many interesting aspects of language and are indeed valid 
as linguistic evidence. 

We will first review some of the problems related to the use of speech errors as
linguistic data.

1. collecting speech error data. The first serious attempt to study speech errors 
was made by Meringer and Mayer (1895). Since then, speech errors have been used in 
support of various linguistic and psycholinguistic arguments. They have in particular 
played a major role in the development of models of speech. 

There are two ways of collecting speech errors: the ‘pen-and-paper’ method and 
the experimental method. We review both methods together with some of the prob¬ 
lems related to each procedure. We draw heavily on Cutler 1982 and Poulisse 1999. 

1.1. the pen-and-paper method. Initially, most collections were gathered by writ¬ 
ing down the slips of the tongue actually attested in conversation. This is referred 
to as the ‘pen-and-paper’ method. This method has become the target of criticism 
concerning its reliability. Cutler (1982:6) discusses the three major problems associ¬ 
ated with speech error research of this kind. First, it is not possible for collectors to 
record all slips occurring in a given period of time or a given number of utterances. It 
is extremely difficult for collectors to listen for errors and to pay attention to the content
of the conversation at the same time. Second, some kinds of errors are harder to
detect than others. This problem of perceptual bias may result in the more easily detected
kinds being ‘over-represented’ in the corpus. Third, there is the danger of the collection being biased by
the distributional characteristics of a language. It is of no theoretical interest if some 
types of errors are reported more often than others if there was a distributional difference
to begin with.

1.2. experimental method. In an attempt to overcome some of the limitations and 
problems of collecting spontaneous speech errors, experiments have been conducted 
in order to elicit errors in highly controlled contexts. Many techniques have been 
used to elicit errors in experiments. One of the most widely used techniques is the 
SLIP (Spoonerisms of Laboratory-Induced Predisposition) procedure developed by
Motley et al. The advantage of experimentally-induced errors is that they are not open 
to the criticisms that spontaneous errors face. Since these experiments are designed 
to elicit errors in highly controlled contexts, researchers need not worry about the 
detectability problem or the distributional bias. However, there are certain disadvan¬ 
tages to the experimental method. First, the validity of the data becomes questionable 
when we consider the highly artificial tasks the informants are required to perform. 
The use of tasks that require more difficult articulation than spontaneous speech may 
lead to an alteration in normal planning processes. The second disadvantage of the 
experimental data concerns the type of errors that can be elicited. Since experiments 
can yield only certain types of errors, it is not possible to capture error patterns as a 
whole. We need assumptions and hypotheses in order to conduct the experiments. 
The experiments themselves cannot provide these. Here again, just as in the case of 
the pen-and-paper method, we find that there are problems. 

1.3. speech errors: valid or invalid? We saw above that there are problems related
to both the pen-and-paper method and the experimental method. This being the case,
the question is whether speech errors are valid as linguistic evidence. Are
the problems and limitations discussed an indication that speech lapses are not reli¬ 
able as linguistic data? 

It turns out that speech errors can tell us some interesting things about language. We 
just need to recognize the problems associated with speech error research and also to 
take into consideration the advantages and disadvantages of the two research methods. 
An advantage of the experimental procedure is that it is suitable for testing hypotheses 
in a direct manner. However, it is the errors collected from spontaneous speech that 
allow researchers to raise the hypotheses to be tested in experiments. Therefore, if the 
pen-and-paper method is conducted with the goal of raising hypotheses about lan¬ 
guage storage and processing, then there is no need to worry about detectability and 
distributional bias. This is the position taken in this paper. By focusing on the errors 
from this standpoint, we provide hypotheses worthy of testing in future studies. 

In the following section, we compare the speech errors of English and Japanese 
speakers. The basis of this study is a collection of 298 Japanese errors recorded mostly 
by the author from spontaneous speech. The English data come from the literature 
(e.g., Cutler 1982, Fromkin 1973, Laubstein 1987). We attempt to find out whether 
the generalizations claimed for speech errors in English also apply to Japanese. 
Of particular interest is the interaction between the speech errors and the syllable 
structure of a language. We limit ourselves to two hypotheses related to syllable structure
that were put forth on the basis of slip research: the syllable structure hypothesis
and the syllabic similarity hypothesis. 

2. two hypotheses. In this section, we review the basic assumptions of the two hypoth¬ 
eses related to syllable structure. We begin with the syllable structure hypothesis. 

2.1. the syllable structure hypothesis. The basic assumption of the syllable 
structure hypothesis is that syllable initial consonant errors occur more frequently 
than errors in other positions because of the internal subgrouping of elements within 
the syllable: the initial consonant(s) independently constitute the onset of a syllable 
whereas the vowel is subgrouped into a rhyme together with the final consonant or 
consonant cluster: 

(1)          σ
           /   \
       onset   rhyme
         |     /    \
         | nucleus  coda
         |    |      |
         C    V      C

This claim regarding the ‘initialness effect’ was first made by MacKay 1970, who 
reported that 96% of ‘within word’ exchanges, and 81% of the ‘between word’ 
exchanges involved syllable initial elements. A similar percentage was also reported 
by Laubstein 1987, whose study of English speech errors showed that of 559 speech
lapses, 63% involved the substitution of onsets by onsets, thus supporting the syllable 
structure hypothesis. 

2.2. the syllabic similarity hypothesis. The syllabic similarity hypothesis or syllable
position constraint is based on a claim made in Boomer and Laver 1968. It posits
that phonemes in initial syllabic position replace those in initial position, nuclear 
replace nuclear, and final replace final. Simply put, this hypothesis accounts for the 
interaction of consonants with consonants only, vowels with vowels only. 

The claim that phonological units involved in errors retain their original posi¬ 
tion in the syllable has been supported in many studies of English speech errors. For 
example, MacKay 1970 reported that reversed consonants occurred in the same syl¬ 
labic position 98% of the time, and in the case of reversed vowels, 81% originated in 
the same syllabic position. Laubstein 1987 also reports that errors involving interac¬ 
tion between onset-onset, coda-coda, peak-peak make up 88% of the total. Details are 
given in Table 1 (overleaf). 
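
The percentages can be checked against the raw counts. The short sketch below recomputes them from the three counts in Table 1 and the corpus size of 559 cited in section 2.1; the variable names are mine.

```python
# Counts from Table 1 (based on Laubstein 1987:345); 559 is the corpus size.
TOTAL_ERRORS = 559
same_position = {"onset-onset": 352, "coda-coda": 59, "peak-peak": 81}

percentages = {k: round(100 * v / TOTAL_ERRORS) for k, v in same_position.items()}
combined = round(100 * sum(same_position.values()) / TOTAL_ERRORS)
```

The three same-position categories together account for 492 of the 559 errors, which rounds to the 88% reported.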

Lately, however, some researchers have questioned the validity of this hypothesis 
(e.g., Meijer 1997). These researchers argue that the evidence presented so far does not 

Syllabic position    Frequency (Number)    Frequency (Percentage)
onset-onset          352                   63%
coda-coda             59                   11%
peak-peak             81                   14%

Table 1. Syllabic position of segments interacting in errors (based on Laubstein 1987:345).


Group    Age                Number of errors collected
A        3-6 years old      84
B        10-16 years old    26
C        20-29 years old    95
D        30-55 years old    93

Table 2. The four groups used in the study.

necessarily support the claim that consonants (or vowels) are bound to a particular 
position in the syllable. Rather than ‘syllabic’ similarity, they claim that the reason why 
consonants and vowels do not interact with one another in speech errors is due to 
their ‘phonetic’ similarity. They point out that the substitution of a vowel by a conso¬ 
nant might result in an unpronounceable string and should therefore be rare. 

3. data analysis: English vs. Japanese. In this section, we analyze Japanese speech 
errors by comparing them with English speech errors. Our main focus is on the two 
hypotheses given in section 2. We studied the four groups shown in Table 2. 

3.1. THE SYLLABLE STRUCTURE HYPOTHESIS AND JAPANESE ERRORS. As we saw above,
the basic assumption of the syllable structure hypothesis is that syllable initial errors
are more frequent than syllable final errors because final consonants form a subgroup 
(rhyme) with the preceding vowels. Proponents of this hypothesis have taken for 
granted the fact that errors occur often in onsets due to syllable structure. This is a 
matter of course when we consider the ‘Germanic’ type of syllable structure as in (1), 
repeated on the next page as (2). 

Here a break exists between the onset and the nucleus; the nucleus forms a con¬ 
stituent, the rhyme, with the following consonantal element. The frequency of syl¬ 
lable initial errors is attributed to this break between the onset and the rhyme, thus 
supporting the syllable structure hypothesis. However, in Japanese, syllable structure 
differs. In this language, there is cohesiveness between the onset and the nucleus, as 
shown in (3): 

Error type                                        Number of errors
a. syllable initial consonant errors              162 (54.4%)
b. errors involving the nucleus                    60 (20.1%)
c. errors involving <onset + nucleus>              22 (7.4%)
d. errors involving onset or <onset + nucleus>     23 (7.7%)
e. errors involving moraic phonemes                31 (10.4%)

Table 3. The occurrence rate of each error type.

(2)          σ
           /   \
       onset   rhyme
         |     /    \
         | nucleus  coda
         |    |      |
         C    V      C

(3)          σ
           /   \
        mora    \
        /  \     \
       C    V   moraic phoneme

In Japanese, the onset and the nucleus are attached to a single mora node. If the 
‘initialness effect’ is due to syllable structure, then it can be postulated that the occur¬ 
rence rate of errors involving just the onset would be low in Japanese. 

In Table 3, the occurrence rate of each error type is given. Here we find that nearly 
55% of the errors in Japanese involve just the onset. If we consider the internal syl¬ 
lable structure in (3), we would expect more errors of the (c) type (errors involving 
<onset + nucleus>) to occur. However, these types constitute only 7.4% of the total. 
Even if type (d) errors (errors that can be interpreted either as onset only or as 
<onset + nucleus>) were added, the occurrence rate comes to only 15.1%. This high 
percentage of the syllable initial type of errors indicates that irrespective of syllable 
structure, the onset has a special status in inducing errors. This implies that the fre¬ 
quency of syllable initial errors has little to do with the internal structure of syllables, 
thus denying the basic assumption of the syllable structure hypothesis. 
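
The figures discussed above follow directly from the counts in Table 3. As a quick arithmetic check, the sketch below recomputes each share of the 298 errors and the combined share of types (c) and (d); the dictionary keys are shortened labels of my own.

```python
# Error counts from Table 3.
counts = {
    "a. syllable initial consonant": 162,
    "b. nucleus": 60,
    "c. onset + nucleus": 22,
    "d. onset or onset + nucleus": 23,
    "e. moraic phonemes": 31,
}
total = sum(counts.values())
share = {k: round(100 * v / total, 1) for k, v in counts.items()}
c_plus_d = round(
    100 * (counts["c. onset + nucleus"] + counts["d. onset or onset + nucleus"]) / total, 1
)
```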

3.2. syllabic similarity hypothesis. The syllabic similarity hypothesis, which 
claims that phonological units involved in errors retain their original position in the 
syllable, has been supported in many studies of English. Examples from English are 
given in (4): 

(4) C → C   left hemisphere → heft lemisphere
            pots and pans → pons and pats
    V → V   Wang’s bibliography → Wing’s babliography
            Bev and Bill → Biv and Bell
                                  (Laubstein 1987:343, 349, 350)













HARUKO MIYAKODA 


In Japanese, the general pattern attested is the same as for English: a consonant 
replaces a consonant, a vowel replaces a vowel. Furthermore, the CV syllable (or 
mora) is exchanged with another CV. Examples are given in (5): 

(5) C → C     karini → tarini                    ‘if’
    V → V     ondoku → ondaku                    ‘reading out loud’
    CV → CV   ta ra ko su pa → ta ra su ko pa    ‘cod roe spaghetti’

There are, however, exceptions to this pattern, and these involve the so-called
moraic phonemes.

The mora plays a central role in the Japanese language. In practice, the mora often 
overlaps with the syllable, and indeed in many cases moras are syllables. The main 
reason why the mora and not the syllable is assumed to be the basic prosodic unit 
in Japanese is that several elements serve as independent units, although they do not 
qualify as independent syllables. These elements, the moraic phonemes, fall into four 
kinds: the nasal coda (N), the geminate consonant (Q), the second half of long vowels, 
and the second half of diphthongal vowel sequences. Examples of words with moraic 
phonemes are given in (6): 


(6) examples of moraic phonemes
    a. nasal coda                                    ro N do N          ‘London’
    b. geminate consonant                            pi Q tsu ba a gu   ‘Pittsburgh’
    c. second half of long vowels                    ro o ma            ‘Rome’
    d. second half of diphthongal vowel sequences    ha wa i            ‘Hawaii’


The nasal coda represents the sounds [n], [m], [ŋ], depending on the following phonetic
context (e.g., paN to [panto] ‘bread and’, paN mo [pammo] ‘bread also’, paN ga
[paŋga] ‘the bread’). The geminate consonant represents a doubling of the consonant
that appears in the onset of the following syllable (e.g., kiQ sa [kissa] ‘tea room’, kiQ te
[kitte] ‘postage stamp’, kiQ pu [kippu] ‘ticket’). These moraic phonemes, although not
capable of forming an independent syllable, serve as an independent timing unit just 
as the regular CV one-mora syllables do. 
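
The realization of N and Q described above can be sketched as a small rule. The version below reproduces the six examples just cited. Extending the rule to whole classes of bilabial and velar consonants, and realizing N as [n] everywhere else (including word-finally), are simplifying assumptions of mine that go beyond the three contexts illustrated in the text.

```python
BILABIALS = set("pbm")   # assumed class; the text illustrates only m
VELARS = set("kg")       # assumed class; the text illustrates only g

def realize_moras(moras):
    """Spell out N and Q in a list of romanized moras, e.g. ["pa", "N", "to"]."""
    out = []
    for i, mora in enumerate(moras):
        # First segment of the following mora, or "" at the end of the word.
        nxt = moras[i + 1][0] if i + 1 < len(moras) else ""
        if mora == "N":
            if nxt in BILABIALS:
                out.append("m")
            elif nxt in VELARS:
                out.append("ŋ")
            else:
                out.append("n")
        elif mora == "Q":
            out.append(nxt)   # copy the onset consonant of the next mora
        else:
            out.append(mora)
    return "".join(out)
```

For instance, `realize_moras(["pa", "N", "mo"])` yields `pammo` and `realize_moras(["ki", "Q", "te"])` yields `kitte`.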

A close observation of Japanese speech errors shows that in the case of moraic pho¬ 
nemes, consonants can be replaced by vowels, and vice versa. Examples are given in (7): 


(7) V → C   ke i za i te ki → ke N za N te ki    ‘economical’
            sa N sa i → sa N sa N                ‘three-year-old’
    C → V   ra N ki N gu → ra i ki N gu          ‘ranking’
            yu zu Q te → yu zu u te              ‘to hand over’


The examples in (7) indicate that interaction between a consonant and a vowel can 
take place as long as they interact within their ‘own kind’, that is, the interaction is 
between a moraic nasal or obstruent and the second half of a long vowel or
diphthong. The examples involving moraic phonemes support the syllabic similarity
hypothesis. If, as some researchers claim, it is the ‘phonetic’ similarity that is crucial, 
we would not expect such an interaction to occur. What distinguishes the moraic 
phonemes from the other phonemes is their position within the syllable: moraic pho¬ 
nemes can only occupy coda position (cf. (3)). 

4. discussion. In section 3 we analyzed the Japanese data relative to two hypotheses. 
Our findings suggest that 1) the ‘initialness effect’ is not due to syllable structure; in 
both English and Japanese, languages that have different syllable structure, there is a 
strong tendency for errors to occur in onsets; 2) although data from English does not 
strongly support the syllabic similarity hypothesis, the behavior of moraic phonemes 
in Japanese implies that phonological units involved in errors retain their original 
position in the syllable. Taken together, these two findings from Japanese speech errors
suggest that units involved in errors retain their original syllable position, the exception being the onset.

At this point, it might be argued that the Japanese syllable structure given in (3) 
is wrong, and that Japanese, together with the ‘Germanic’ type of languages, has the 
syllable structure in (8) (Abe 1987:6): 

(8)          σ

     C (G)   V   V [moraic nasal, first half of geminates]

The structure in (8) is actually the one adopted by Abe 1987. The internal structure of
the syllable proposed by Abe corresponds to the ‘Germanic’ syllable type. The advantage
of this approach is that it can account for the independent involvement of the onset
in errors in both English and Japanese: the break between the onset and
the rhyme accounts for the ‘initialness effect’. However, the problem with this structure
is that it in no way reflects the linguistic realities observed in Japanese. First, if we look
at the blending process and blending errors of Japanese (Kubozono & Ota 1998:32), we
find that the break after the CV is crucial. Examples are given in (9):


(9)  a. blending             o + (si)Qpo                 →   oQpo      ‘tail’
        cf. English:         sm(oke) + (f)og             →   smog

     b. blending errors      do.(o.si.te) + (na).N.de    →   do.N.de   ‘why’
        cf. English error:   cl(ose) + (n)ear            →   clear


Second, in the case of stuttering as well, the CV as a constituent is repeated in Japanese,
as shown in (10):

(10) stuttering

sa - sa - sa - sakana ‘fish’ 

to - to - to - tombo ‘dragonfly’ 

cf. English: s - s - s - six, t - t - t - ten 

The syllable structure depicted in (8) clearly does not reflect what is actually attested, 
whereas the structure in (3) does. Therefore, it seems reasonable to assume that in 
the case of Japanese, the onset and the nucleus form a subgroup, and that irrespective
of syllable structure, the onset has a special status in inducing errors. This claim
is worth exploring further in future studies, especially using the experimental
method. Of particular interest would be to test whether the ‘initialness effect’
is concerned with syllable-initial position, as reported by Del Viso 1987 (in Poulisse
1999:12), or with word-initial position, as claimed by other researchers (e.g., Davis
1989, Shattuck-Hufnagel 1992).

5. conclusion. In this paper, we have analyzed the speech errors of Japanese based
on two hypotheses regarding syllable structure that were posited in earlier studies: 
the syllable structure hypothesis and the syllabic similarity hypothesis. Our findings 
suggest that phonological units involved in errors retain their original position in 
the syllable, thus supporting the syllabic similarity hypothesis. However, the basic 
assumption of the syllable structure hypothesis that the ‘initialness effect’ is due to 
syllable structure could not be supported. Rather, the Japanese error data suggest that 
irrespective of syllable structure, the onset has a special status in inducing errors. 
although there has been a considerable body of research in various fields dealing
with similarities and differences between speech and writing, there is little agreement 
on the salient characteristics of the two modes. The written language or discourse is in 
general considered structurally more elaborated, complex and formal, whereas spoken 
language or discourse is characterized as being structurally simple and informal (Chafe 
1985, Gumperz et al. 1984). On the other hand, while spoken discourse has a higher 
degree of interaction between the speaker and the audience, written discourse seems 
more apt to show the writer’s detachment from the audience. Nevertheless, some studies 
have found little difference between spoken and written discourse, and still others argue 
that speech is more elaborated and complex than writing (Halliday 1987). 

Are oral and written languages different? The general consensus is that they are, 
because they represent different ways of communicating and offer different ways of 
knowing and of reflecting on experience. They serve distinct functions and purposes 
in a discourse community, in time and space. They utilize different contexts. While 
oral language is typically associated with conversation that is produced and processed 
in the context of face-to-face exchange, written language is typically associated with 
the language of books and explanatory prose. Oral language is characterized as informal,
interpersonal, and narrative-like with prosodic cues, deixis and paralinguistic
devices readily available, while written language is considered formal, planned and
expository-like with limited reciprocity between the writer and the reader (Horowitz &
Samuels 1987). Nevertheless, there is much variation and overlap in this simple
dichotomy between oral and written texts, depending upon the purposes for which 
they are used and the audience they serve. 

Furthermore, cognitive psychology has offered us much evidence that spoken and 
written language involve different mental processes. There is an abundance of research 
in the field that addresses the processing of speech and writing and the short-term 
consequences of exposure to oral versus written discourse. Jahandarie (1999) presents 
a thorough and detailed review of various issues and empirical evidence concerning 
the comprehension of oral and written words and sentences, the retention of these ele¬ 
ments, and both comprehension and retention of connected discourse. This large body 
of evidence demonstrates that major differences exist between spoken and written lan¬ 
guage, as shown in the distinct ways our mind deals with each language. It is obvious 
that the majority of evidence from separate lines of research indicates the existence of 
the differences between spoken and written discourse. Nevertheless, the comparison
between the two modalities is usually made either between typical speech and typical
writing (e.g., conversation vs. explanatory or expository prose), or between two different
genres (e.g., narrative vs. academic writing). While distinct patterns have emerged from
these comparisons between oral and written discourse, the characterizations may be
partly due to genre differences, because patterns of discourse such as rhetorical structures,
attribution, adversative, covariance, response, etc., do not work in the same way
across readers of various age groups and grades and across text topics (Horowitz &
Samuels 1987). It would be more interesting to examine oral and written language of
the same genre and the same style to discover differences and similarities between the 
two modalities where other variables of discourse are kept constant. The present study 
attempts to explore contrasts between oral and written language from a much narrower 
perspective: a comparison between the two modalities of the same type of discourse— 
narrative accounts of the same video clip that are produced under the same controlled 
conditions by subjects of a homogeneous background. This limited and controlled 
comparison would help us better understand differences as well as similarities between 
oral and written language in a stricter sense, because other apparent differences such as 
audience, purpose, content, discourse context, discourse structure, etc., are kept mini¬ 
mal in oral and written narratives. 

1. the narrative study. A narrative study was designed and conducted to elicit 
both oral and written stories from native English and Mandarin Chinese speakers. 
The purpose of the study is to investigate how speakers construct and tell a story in 
either oral or written form after watching the same stimulus material, a short video¬ 
clip, and what systematic similarities and differences occur in both narratives. The 
study included participants of two languages so as to explore a cross-linguistic con¬ 
trast between the two types of narratives. 

1.1. stimulus material. The stimulus material consists of a 4-minute video-clip entitled
‘The New Doorbell’. The clip is a cartoon about a man who installs a new doorbell
in his apartment and then waits anxiously for people to ring it. The cartoon is a silent
color movie with background music; no written language ever appears on the screen
except the title, which is shown at the beginning of the video-clip in both Chinese
and English. As in any other typical story, ‘The New Doorbell’ consists of a beginning
which introduces the main character and the setting and the theme of the story, a 
middle which unfolds the development of the storyline, highlighted by a climax, and 
an end which brings the outcome of the story. The overall structure of the story is 
schematized in Figure 1. 

The story is both linearly and hierarchically structured. We wanted to see how 
speakers would organize the story information, what information they would con¬ 
sider important, less important or marginal, how they would encode the hierarchical 
structure of the story into a linear text, and what linguistic devices they would use to achieve
story coherence. Furthermore, we were particularly interested in how subjects’ performance
differed between the oral and written tasks across the two languages.
                              The New Doorbell

        Beginning                  Middle                     End

    Setting and Theme      Development and Climax           Outcome

Figure 1. ‘The New Doorbell’ structure.



1.2. methods and procedures. The narratives were produced either individually or 
in groups. Before the story-telling began, a subject or a group of subjects was given 
a passage of written instruction on what they were going to do. The subjects were
asked to first watch a video-clip four minutes in length, and then describe orally or
in written form what happened in the story. The written instruction did not describe
the nature of the video-clip or the purpose of the study, nor did it mention the word
‘story’ or ‘story-telling’. We hoped that the subjects would be able to perceive, construct
and describe the video-clip with as few preconceived notions as possible. The oral
narratives were tape-recorded and later transcribed; the written narratives were col¬ 
lected immediately after the subjects finished writing the story. 

1.3. subjects. Seventy subjects participated in the narrative study. Forty were native 
American English speakers from the University of Maine at Farmington, and thirty 
were native Mandarin Chinese speakers from the Central China University of Finance 
and Economics. The subjects were randomly assigned to two groups in each language: 
20 in English Oral (EO), 20 in English Written (EW), 15 in Chinese Oral (CO) and 
15 in Chinese Written (CW). All subjects were undergraduates and about two-thirds
were women. 

2. results and discussion. In general, subjects in both tasks and both languages
produced the narratives in comparable ways in terms of episode selection and
description, coherence-building, event sequencing, reference tracking, and inference
making. They did not merely represent the visual data, such as the movements of the
objects or persons, but rather offered an interpretation of the events (Loftus 1979). In other
words, they constructed a meaning. While describing what happened in the video-clip,
they were more concerned about the cause and outcome of the actions and the pur¬ 
pose and explanation of the characters who performed the actions. They stayed on 
the main event line, focused on the actions of the central character, and paid less 
attention to information that was less important in the development of the storyline. 
Although some subjects only touched upon or even omitted some scenes which were 
not critically related to the theme of the story, all narratives exhibited the global 
structure of story construction: a beginning consisting of background information 
such as the setting and the main character, a middle showing the main storyline 
and the development of the story, a climax dealing with the conflict between the man 
(the main character) and a postman, and an ending describing the disappointment 
of the man. This global organization of the narrative discourse demonstrates that 
subjects, or rather, people in general, have expectations about how to tell a story. They
take in visual stimuli, construct a mental representation of what they perceive, and
encode it into a linguistically structured message. Even though the written instruction
did not ask subjects to ‘tell a story’, they nonetheless uniformly organized the visual
data into a hierarchical structure with interrelated events, describing who, when,
what, where, what happened, and why in their narrative, whether oral or written.

While the overall structure of the narrative across the four groups is quite similar, 
striking differences are found between the two modalities. The remaining sections of 
the paper focus on the patterns of differences that emerged at the word/phrase and 
clause/sentence level between the two modalities. 

2.1. at word/phrase level. The first major difference at this level is the use of verb
tense between English oral and written narratives. Even though past tense is considered
the norm in narratives (Biber 1988), the majority of subjects (15 out of 20) in the
writing group employed present tense, as compared to only 8 subjects using that tense
in the oral group. It seems that the tense preference is different between speakers and
writers. The use of present and past tense is found to be associated with the modality
(χ² = 6.46*, p < 0.01). The tense difference might be due to the nature of oral versus
written story-telling. Thirteen out of twenty subjects in EO started their narrative
with a comment that they had just watched a video/movie clip, and this very first sentence
sets up a time frame for their subsequent story-telling, i.e., what happened in
the video. In EW, however, very few of the subjects mentioned the fact that they were
about to tell a story of a video clip they had just watched; sixteen out of twenty subjects
started the narrative directly with the central character of the story, e.g., A man... or
There is a man..., and then continued almost always with the present tense description
throughout the written narrative. The typical beginning of the oral and written
English narratives is presented in (1) and (2) below.

(1) EO: S10 We watched a video... called the new doorbell. It was about a man...

(2) EW: S5 There is a man who lives in...

It seems as if speakers are recounting past events but writers are describing what is 
happening at the moment. Although Chinese does not have overt markers for verb 
tense, subjects did begin their narratives in the two modalities differentially to a cer¬ 
tain degree. Seven out of fifteen subjects in CO started their narrative with the men¬ 
tion of the cartoon/video, suggesting that they were telling the story of a cartoon or 
video they watched previously. Only one subject did so in CW; the other fourteen 
opened their narratives, as did their English counterparts in EW, with some kind of 
introduction of the central character, then continued to report the events in their 
temporal sequence. For example: 
(3) CO: S9 umm, zhe-ge dong-hua-pian de ming-zi jiao ‘xin zhuang de men-ling’...

           cartoon of name call ‘new-install doorbell’

           ‘Umm, the title of the cartoon is “The New Doorbell”...’

(4) CW: S2 yi-ge nan-ren zhu-zai yi-zhuang gong-yu li,...

           a man live a apartment in

           ‘A man lives in an apartment,...’
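
The association between modality and tense reported above for the English groups is the kind of claim a Pearson chi-square test of independence addresses. The following is a minimal sketch, not a reconstruction of the analysis actually run. It assumes a 2×2 table in which every subject is counted under exactly one tense, so that the past-tense counts are 20 minus the reported present-tense counts, and it applies no continuity correction. Since the text does not say how its own table was built, the statistic computed here need not coincide with the reported value.

```python
# Pearson chi-square test of independence for a 2x2 table (no continuity correction).
# Rows: modality (EW, EO); columns: tense (present, past).
# Assumption: past-tense counts are 20 minus the reported present-tense counts.
observed = {
    "EW": {"present": 15, "past": 5},
    "EO": {"present": 8, "past": 12},
}


def chi_square(table):
    """Return the Pearson chi-square statistic for a table of observed counts."""
    rows = list(table)
    cols = list(next(iter(table.values())))
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    n = sum(row_totals.values())
    stat = 0.0
    for r in rows:
        for c in cols:
            expected = row_totals[r] * col_totals[c] / n
            stat += (table[r][c] - expected) ** 2 / expected
    return stat


# Standard critical values of the chi-square distribution with 1 degree of freedom.
CRITICAL_05 = 3.841
CRITICAL_01 = 6.635

stat = chi_square(observed)
print(f"chi-square = {stat:.2f}")
print("exceeds the .05 critical value:", stat > CRITICAL_05)
print("exceeds the .01 critical value:", stat > CRITICAL_01)
```

Comparing the statistic against the critical values for one degree of freedom shows at which conventional level the association would count as significant under these assumptions.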

Another difference found at the word/phrase level between the oral and written narratives
is the occurrence of hedges and hedge-like expressions such as kind of/kinda,
sort of, seem like, anyway, etc., which are used almost exclusively by the oral group in
both languages. For example,

(5) EO: S3 So he shut the door and the delivery man kinda looked at the doorbell,
           and then walked away.

(6) EO: S7 Then... a short little man came and knocked on the door,... and this
           made him kinda angry.

(7) CO: S5 ta haoxiang shi you-dianr nao-huo, shi-ba?

           he seem-like is have-a little frustrated isn’t he

           ‘He seems a little frustrated, doesn’t he?’

(8) CO: S12 ta ke-neng you-dianr shi-wang ba,...

           he maybe somewhat disappointed

           ‘He is kind of disappointed.’

Such expressions suggest that speakers were uncertain about the use of adjectives
like angry, frustrated, disappointed, upset, etc. Chafe and Danielewicz (1987) attribute
the hedge use to the limited lexical choice on the part of the speaker, who is not completely
satisfied with his/her lexical choice, yet has no time to ponder a better word
because speaking is done on the fly. They argue that speaking calls for greater expenditure
of cognitive effort and hence, speakers tend to operate with a narrower range
of lexical choices than writers. As a result, the vocabulary of spoken language is more
limited in variety, regardless of the kind of speaking involved. While the argument
is intuitively appealing, it is not borne out in our study. Although writers of both
languages never used hedges such as those shown in the above samples, their choice
of vocabulary is very much the same as that of the speakers, even if they ‘have the
leisure to dip into the rich storehouse of literary vocabulary, search for items that will
capture nuance’ (Chafe & Danielewicz 1987:88). For example, when describing the
man’s disappointment, subjects across the four groups used the same set of adjectives,
such as upset, angry, frustrated, disappointed, disheartened, sad, and mad, sometimes
accompanied by intensifiers or quantifiers such as very, extremely, a little, more, and
so. It seems, at least for the genre of story-telling, that the range and level of vocabulary
are quite similar between the two modalities: in general, the words are simple and of high
frequency.

MING-MING PU & QING-HONG PU

Why, then, do speakers seem less certain of their lexical choice than writers, given 
the fact that they use practically the same set of words? On the one hand, we agree 
that the fundamental difference lies in the inherent cognitive constraints of speaking 
and writing, as pointed out by Chafe and Danielewicz: speakers have little time while 
writers are not pressed to do on-line production. We argue, on the other hand, that 
what our speakers hesitated about but had no time for is not so much the availability 
of vocabulary as the verification of what they actually saw prior to the story-telling. 
For example, the speaker in (5) was making a simple statement that the postman looks 
at the doorbell button before he leaves. The subject had no trouble with the choice or 
preciseness of the verb ‘look (at)’, but she wasn’t sure, at that moment of the on-going 
production process, whether or not the postman actually looked at the doorbell. She 
couldn’t afford to think more about it, but used a hedge instead to show her uncer¬ 
tainty. Similarly, (6), (7) and (8) above reveal that the speakers knew that the man was 
angry, frustrated, or disappointed but they were not sure of the degree of his anger, 
frustration, or disappointment. In order to maintain the quick and smooth flow of
production, they had no choice but to use a hedge to mark their state of mind at
that particular juncture. Of course, speakers can pause, make false starts, or even comment
on their uncertainty to revise what has been said, but too much fumbling is harmful
to effective communication, and acknowledging I’m not sure/I
don’t know is damaging to the speaker’s credibility. In written narratives,
by contrast, hedges and false starts almost never occurred, because
in writing, with or without editing, one always has more time for language processing.
Writers usually plan a clause/sentence ahead before they actually write it down.
It seems that they are aware, consciously or not, of the permanency and the formality 
of the writing and try to avoid hesitation and uncertainty in their narratives. What 
is written down exists (more permanently), spread out on the page. Informal words 
such as kind of/kinda don’t usually belong to the written form unless one purposely 
tries to mimic the oral language. 

2.2. at clause/sentence level. As discussed in the prior section, speakers did not 
appear to have a more limited set of vocabulary than writers in their production of 
narratives: both employed relatively simple and high-frequency nouns, verbs, adjec¬ 
tives and adverbs. At the clause or sentence level, we also find that both types of narra¬ 
tives are comprised primarily of main clauses, usually simple and short. The similarity 
in clause length and syntax reflects the nature of narratives, be it oral or written, 
which are modeled on the story-telling genre, because narratives depend for their 
effect on interpersonal involvement between the speaker/writer or the character and 
the reader (Tannen 1984). Of all the complex sentences in our data, adverbial clauses 
occur most frequently. Although adverbial clauses have a very similar frequency of
distribution among the four groups, they exhibit distinct patterns between the two
modalities.

         Temporal                          Cause            Purpose          Total
         MC+SC           SC+MC             MC+SC   SC+MC    MC+SC   SC+MC
         E1+E2   E2+E1   E1+E2   E2+E1
EO       13      3       10      1         21      0        4       0        52
EW       17      11      11      1         11      0        5       0        56
Total    30      14      21      2         32      0        9       0        108

Table 1. English adverbial clause.

In general, subjects in both English groups prefer the unmarked structure of main
clause preceding subordinate clause (Pu & Prideaux 1994), i.e., MC+SC. Nevertheless,
for complex sentences with temporal adverbial clauses, subjects prefer to describe
events as they occur in the natural order, namely, E1+E2, even if that results in a
marked construction of SC+MC. For example, (9) and (10) below are coded in the
order of SC+MC while following the temporal sequence of E1+E2.

(9) EO: S7 When he pressed it, it played music.

(10) EW: S4 After he listens to the music of his doorbell, he sets a chair near
            the door...

Table 1 summarizes the complex sentences containing adverbial clauses found in the 
two English groups. The result indicates that the order of temporal sequence appears to 
override the MC+SC construction frequently, especially for the speakers, who seem 
to be more constrained by the temporal sequence of events in processing the story 
information than writers. 
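
The comparison drawn from Table 1 can be made concrete by computing, for each English group, the share of temporal adverbial clauses that follow the natural event order (E1+E2) and the share that use the marked SC+MC order. This is a sketch under stated assumptions: the names `temporal`, `total`, `iconic_share` and `marked_share` are illustrative, and the two EW cells entered as 11 are inferred from the row and column totals of the table.

```python
# Temporal adverbial clause counts from Table 1, keyed by (clause order, event order).
# Assumption: the EW cells entered as 11 are inferred from the table's totals.
temporal = {
    "EO": {("MC+SC", "E1+E2"): 13, ("MC+SC", "E2+E1"): 3,
           ("SC+MC", "E1+E2"): 10, ("SC+MC", "E2+E1"): 1},
    "EW": {("MC+SC", "E1+E2"): 17, ("MC+SC", "E2+E1"): 11,
           ("SC+MC", "E1+E2"): 11, ("SC+MC", "E2+E1"): 1},
}


def total(group):
    """Number of temporal adverbial clauses produced by a group."""
    return sum(temporal[group].values())


def iconic_share(group):
    """Share of temporal clauses whose clause order mirrors event order (E1+E2)."""
    iconic = sum(n for (_, events), n in temporal[group].items() if events == "E1+E2")
    return iconic / total(group)


def marked_share(group):
    """Share of temporal clauses in the marked SC+MC order."""
    marked = sum(n for (clauses, _), n in temporal[group].items() if clauses == "SC+MC")
    return marked / total(group)


for group in temporal:
    print(f"{group}: n = {total(group)}, "
          f"E1+E2 = {iconic_share(group):.1%}, SC+MC = {marked_share(group):.1%}")
```

On these counts the oral group keeps to the natural event order, and uses the marked SC+MC construction, in a larger share of its temporal clauses than the written group does.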

It makes perfect processing sense that speakers, rather than writers, should rely 
more heavily on the temporal sequence of events during the storytelling, because they 
are more constrained in cognitive resources during narrative production. As several 
researchers have observed, it is easier to encode, store in memory and retrieve a chain 
of events that are narrated in a coherent sequence, and temporal coherence facilitates
the mapping process within units or episodes (Gernsbacher 1990, Givón 1993). Although
writers are sensitive to the reader’s needs, trying to follow the event sequence and code
events in the unmarked structure (MC+SC), they can nonetheless afford to manipulate
the sentence structure to a certain degree to serve special functions. Of interest here
is the fact that in EW, all 12 complex sentences with the marked order of SC+MC are 
found at the beginning of an episode, signaling the episode boundary. Examples (11) 
and (12) below illustrate such sentences marking the advent of a new episode. 

(11) EW: S8 After the man waited some more, he heard another person come up
            the stairs.

(12) EW: S10 Every time he hears someone walking up the stairs, he gets excited.

The use of marked structure to indicate episode boundaries has been explored in prior studies
(Carpenter & Just 1975, Gernsbacher 1990, Pu & Prideaux 1994), which show that
speakers and writers use certain devices to signal for their listeners and readers the
beginning of a new passage or episode, where there is a change in topic, point of view,
location, or temporal setting.

While there are different aspects to the use of adverbial clauses by English speakers 
and writers, adverbial clauses found in Chinese spoken and written narratives differ 
in other ways, even though both have the same frequency of occurrence. Unlike their 
English counterparts, most Chinese complex sentences containing adverbial clauses 
have a fixed order of SC+MC. For example, 

(13) CW: S8 sui-ran ta deng-le hen chang shi-jian, dan ta bing-bu qi-nei

            though he wait very long time but he not discouraged

            ‘Though he has waited for a long time, he is not discouraged.’

(14) CO: S3 (yi-ge zhong-nian ren hui-dao jia-zhong.)

            a middle-aged man come home

            Hui-dao jia yi-hou, ta ba wai-tao tuo-le,

            come home after he OM jacket take-off

            ‘(A middle-aged man comes home.) After (he) comes home, he takes
            off his jacket...’

(13) contains a clause of concession that precedes the main clause, and (14) is a complex
sentence of SC+MC with the subordinator at the end of the temporal clause. In
CO narratives, the majority of the adverbial clauses indicate time such as the one with
yi-hou (‘after’) in (14), whereas CW narratives contain different types of adverbial
clauses. Table 2 summarizes the results, which show that adverbial clauses in CO
are almost exclusively of the after-type in the SC+MC construction, clearly reflecting
iconicity of event sequence. It is not surprising to find that nearly 90% of adverbial clauses
are of the same type in the spoken narrative, because it is easier for speakers with limited
cognitive resources to construct sentences that mirror the temporal sequence of
events. The construction of other complex sentences calls for the expenditure
of additional cognitive effort, which speakers can rarely spare but writers can. Consequently,
the written narratives display a greater variety of adverbial clauses.
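
The proportions behind this comparison can be recomputed directly from the counts in Table 2. The sketch below is illustrative only: the names `clauses`, `after_share` and `types_used` are hypothetical, and counting the number of clause types with at least one token is simply one rough way to express the variety of adverbial clauses.

```python
# Adverbial clause counts per group, taken from Table 2.
clauses = {
    "CO": {"after": 52, "before": 3, "other temporal": 4,
           "cause": 0, "purpose": 0, "concession": 1},
    "CW": {"after": 22, "before": 2, "other temporal": 20,
           "cause": 8, "purpose": 6, "concession": 4},
}


def after_share(group):
    """Share of a group's adverbial clauses that are of the after-type."""
    counts = clauses[group]
    return counts["after"] / sum(counts.values())


def types_used(group):
    """Number of clause types attested at least once in a group."""
    return sum(1 for n in clauses[group].values() if n > 0)


for group in clauses:
    print(f"{group}: after-type = {after_share(group):.1%}, "
          f"types used = {types_used(group)}")
```

The after-type accounts for 52 of the 60 clauses in CO, against 22 of the 62 clauses in CW, and the written group draws on all six clause types while the oral group uses four.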

Another important difference between oral and written narratives at the clause/
sentence level lies in how speakers and writers package and encode events in sentences.
While constructing narratives orally, subjects tended to put sequences of events in consecutive
yet separate clauses or sentences, usually one proposition per clause. In written
narratives, on the other hand, clauses and sentences are more compact, consisting of
multiple propositions per clause. The contrast is illustrated in the following examples:
Group    Temporal                    Cause    Purpose    Concession    Total
         after    before    other
CO       52       3         4        0        0          1             60
CW       22       2         20       8        6          4             62
Total    74       5         24       8        6          5             122

Table 2. Chinese adverbial clause.

(15) EO: S6 Next a little girl comes up, and she’s bouncing a ball,... And she
            bounces the ball outside his door, and then leaves the scene. She
            doesn’t ring the doorbell.

(16) EW: S14 A little girl bouncing a ball approached the door and walked away
             without ringing the doorbell.

(15) and (16) describe the same sequence of events about the little girl who passes by 
the man’s door without ringing his doorbell. The subject who told the story orally
coded the episode almost scene by scene in 5 separate clauses, each consisting
of only one proposition, whereas the subject who wrote the story encoded the episode
in only two clauses, each of which comprises more than one proposition.
Although writers in general did not employ more subordination than speakers did,
their descriptions are nonetheless more compact and more complex in terms of the
number of propositions per clause and the strategic deployment of present and past
participles. Speakers in general prefer single-propositional sentences, because it is
presumably easier to store and retrieve them than multi-propositional sentences.
Kintsch and Keenan (1973) discovered that reading time increases as a function of the 
number of propositions within a text, and Kintsch and Glass (1974) found that recall 
is better for single proposition sentences than for multiple proposition sentences in 
texts, even when the number of words is constant. 

Moreover, subjects in producing oral narratives are often found to re-encode part 
of the event that has already been depicted in the preceding sentence, revealing 
speakers’ online processing. Samples of the following sort are quite common in Chi¬ 
nese oral narratives: 

(17) CO: S2 yi-ge nan-de ta xia-ban hui-lai, xin zhuang-le yi-ge men-ling.

            a male he off-work return new install a door-bell

            zhuang-shang-le yi-hou,

            install-up after

            ‘A man returns from work, and installs a new doorbell. After (he)
            installs (it),...’

(18) CO: S5 ran-hou ting-jian jiao-bu sheng lai-le. jiao-bu sheng lai-le yi-hou,

            then hear foot-step sound come foot-step sound come after

            ‘Then (he) hears footsteps coming up. After the footsteps come up,...’

(19) CO: S9 zhi-hou jiu lai-le ge you-di-yuan, you-di-yuan lai-le ne,

            afterwards just come a post-man post-man come

            ‘Afterwards comes a postman. As the postman comes,...’

This kind of overlap between events, or rather, repetition of a statement, or partial 
statement, serves as a stalling mechanism in oral language, which enables speakers to 
continue their production with relatively less effort, to find all or part of the utterance 
ready-made, so they can proceed with verbalization before deciding exactly what to 
say next (Tannen 1993). At the same time, repetition provides the listener with redun¬ 
dant, semantically less dense discourse with sufficient pauses for auditory processing. 
Our written narratives, on the other hand, contain very few such overlaps or repetitions
because writers have sufficient time to plan their clauses or sentences before
actually producing them.

3. conclusion. Prior studies on spoken and written language in general examine
oral and written discourse of different varieties such as letters, academic writings, 
newspaper articles, narratives, conversations, and lectures, which are distinct in the 
first place in their structural organization, level of formality, message content, genre 
of text, time and space of production, general audience, and context. With so many 
different facets of discourse involved in the investigation of speaking and writing, the
results are often convoluted and contradictory, because there are too many uncontrolled
factors that might have affected or led to the results. The present study
tries to avoid these confounding factors in its investigation of similarities and differences
between speaking and writing by examining the two modalities in a narrower
scope yet a more controlled manner. We asked subjects of homogeneous background
from two typologically different languages to produce a piece of narrative after they 
watched a video clip. Besides the different medium in which subjects produced the 
narrative, all other aspects of discourse were kept constant: same stimulus material, 
same environment, same goal orientation, same context, same genre, same level of 
formality, same awareness of task (i.e., being recorded either on tape or on paper), 
same preparation time (i.e., immediately after watching the video-clip), and finally, 
no audience present and no time limit placed on production for either mode. With 
all those variables under control, we hoped to be able to compare spoken and written 
language in a stricter sense with cross-linguistic validation. 

The present study argues that both oral and written narratives produced by our 
subjects exhibit patterns of similarities to a certain extent because of the common 
characteristics of the narrative genre, but considerable and significant differences 
manifest themselves between the two modalities because of the distinct mental processes 
involved in oral and written language production. Our production data lend 
support to our argument. On the one hand, the two types of narratives produced 
under the same condition show important similarities. Regardless of modality, 
subjects perform uniformly at the discourse level in global structure organization, 
episode building, and reference tracking. On the other hand, striking differences are also 
found between the two modalities at various levels of discourse. The differences arise 
basically from the distinct mental processes and mechanisms of speaking and 
writing in general, and from the varying degrees to which each inherently depends on 
interpersonal involvement in story-telling. 
PROSODIC THEORY AND EVIDENCE IN ORAL DISCOURSE 


Steven Schaefer 

Ecole Superieure de Chimie Organique et Minerale, France 


How do we make sense of oral linguistic phenomena? When we look for evidence 
of phonological consistency in contemporary speech, what criteria do we follow? The 
presence of extremely flexible prosodic forms in General American English (GAE) is 
problematical for theorizing a phonological structure having a stable relationship to 
speech. Evidence of the way prosody is used must come from concrete acoustic 
examples taken from real-life situations, and not merely from contexts simulated 
in laboratory conditions. Only then can we build up theories from observation, 
much as the description of unwritten languages has traditionally proceeded. 
Speakers of GAE typically use several acoustical attributes to emphasize one or more 
words in an utterance. These emphasized words are more prominent and have a 
communicative function, such as marking pertinent information that has to be made 
apparent to the co-utterer. Speakers emphasize words mainly by changing the fundamental 
frequency (Lehiste 1970, Ladd 1996), but there are other acoustical correlates to consider, 
such as duration and intensity. 

Today most approaches deal with pitch or fundamental frequency exclusively. The 
fundamental frequency can be measured and the intonation curve is often stylized 
and labeled according to the theory one adheres to. In autosegmental theory, this 
is a linearized model of static tones which are said to define the melodic contour. 
Attempts to create an ‘inventory of tunes’ (Nicaise and Gray 1998:80) which can be 
assigned to simple phrases like ‘Is John coming?’ as well as ‘I think John’s coming’ (both 
could be coded as the tone sequence: Medium-High-LowMedium%, this last symbol 
indicating a tone boundary established by a rise in pitch) illustrate a disregard for the 
complexity of the utterer’s constructing meaning not just in one pre-assigned manner, 
but by his manipulating a number of acoustic parameters in the sound continuum to 
achieve a contextualized, ‘hearer-sensitive’ message. According to the authors, what 
can be (rather arbitrarily) labeled Pre-head, Head, and Nucleus in the British tradition 
are assigned target values in the tone string, regardless of individual variation or 
nuance. We find that such a static phonological model does not capture the discursive, 
dynamic aspect of speech. 

The approach proposed here attempts to integrate other ways of looking at sound, 
especially in relation to its connectedness to the speech situations in which it constructs 
meaning. We advocate combining discourse analysis with an acoustic analysis of the 
stream of speech; the two are brought together as an integrated prosodic analysis 
within the utterer-centered model of the Theory of Enunciative Operations 
formulated by Antoine Culioli in France. The main specificity of this model is the attention 
paid to the central role of the utterer (or enunciator) in the construction of meaning; 
theorizing the model proceeds from observation of authentic language data collected 
in specific situations which can be shown to influence or constrain the interpretation of 
any given utterance. As viewed in this theory, what is generally called ‘sentence accent’ 
functions as a marker for linguistic operations carried out by the utterer¹: he uses it to 
draw the attention of his co-utterer to an element of the utterance (here we depart from 
the syntactic term ‘sentence’) and to the fact that (for him) this element stands in some 
unique relationship to himself and/or to some other element included in the discursive 
text or the situation of the speech act. Further analysis provides the linguist with 
a more complete understanding of the limitations and necessary character of prosody 
in speech production/perception, which go beyond mere phonological descriptions of 
what sound in speech should correspond to. 

Previous investigations show that there is nonetheless a demonstrable relationship 
between pitch movement and perceived prominent accent in the utterance. The prob¬ 
lem here is to know whether all speakers use pitch movements to emphasize a word 
in a sentence, or if other attributes can be used to emphasize a word. A second prob¬ 
lem is to find the linguistic tools, or schemata, to interpret the relationship between 
prosodic form and meaning. 

We used a perception experiment to detect the prominent words in a number of 
spontaneous utterances in GAE. When the majority of the four listeners indicate that a 
given word is emphasized by the speaker, we consider that word prominent. We 
call this a prominent point in the prosodic schema of the utterance. (We should point 
out that the schema is not always coextensive with the utterance. Complex or longer 
utterances can contain a number of schemata, and the same schema can theoretically 
extend over more than one short utterance, where an interjection, for example, may 
constitute an utterance.) 
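
The majority criterion just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: ‘majority’ is read as a strict majority (three of four listeners, so a 2-2 split does not count), and the function name and the listener judgments are hypothetical, not data from the study.

```python
from collections import Counter


def prominent_words(judgments, n_listeners):
    """Return indices of words marked as emphasized by a strict majority.

    judgments: one collection of word indices per listener.
    """
    if len(judgments) != n_listeners:
        raise ValueError("expected one judgment set per listener")
    # Count each listener at most once per word.
    votes = Counter(i for marked in judgments for i in set(marked))
    threshold = n_listeners // 2 + 1  # assumption: strict majority (3 of 4)
    return sorted(i for i, n in votes.items() if n >= threshold)


# Hypothetical judgments (not data from the study) for the words of
# 'I think that Carl's right'; each set holds the indices one listener marked.
words = ["I", "think", "that", "Carl's", "right"]
judgments = [{1, 4}, {1}, {1, 3}, {0, 1, 4}]
prominent = [words[i] for i in prominent_words(judgments, n_listeners=4)]
print(prominent)
```

Words marked by only one or two listeners fall below the threshold and are treated as non-prominent in this sketch.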

This study investigates 1) whether all prominent accents are produced by a pitch 
movement, e.g. a rise, fall, or peak (both rise and fall); 2) whether the pitch move¬ 
ments in the prominent words can be distinguished from the pitch movements in 
the non-prominent words, but cannot be represented as static points in a linearized 
model; and 3) whether the linguistic meaning of the utterance is reflected in the pro¬ 
sodic marking of pertinent elements. 

1. listening experiment—method. A listening experiment was carried out to inves¬ 
tigate which words in the sentences were perceived as prominent. Once this informa¬ 
tion was available, the pitch movements in these prominent words were investigated. 

The present study isolated a corpus of spontaneous utterances containing the 
initial sequence ‘I think...’, taken from a 45-minute recording of an unscripted televised 
interview, ‘The Whitewater Debate’. The relationship between perceived prominence 
accent and the measurement of acoustical pitch movements in spontaneous speech 
is described by means of a listening experiment in which we identified which words 
in 32 utterances pronounced by 4 speakers were perceived as prominent by the 
majority of 4 (other) listeners, a small group of native-speaker informants who 
independently distinguished a double-binary ordering scheme of prominent/non-prominent 
and prominent/super-prominent syllables. The pitch contours of these 
words were analyzed in detail using the spectrographic software program Signalyze, 
measuring not only fundamental frequency, but also amplitude and duration on a 
coordinate time axis. Special attention was paid to the type of pitch movement (rise, 
fall or peak) and whether or not other parameters accounted for a lack of perceptibility 
in the pitch movement. 

Prominence       RISE    FALL    RISE/FALL    LEVEL    TOTAL 
max/schema         4       3         2          2 
total/32 sch      45      41        13          7       107 
average/sch      1.4     1.3       .41        .21      3.34 

Table 1. Pitch movement in utterance. 

2. results. There appeared to be a characteristic difference between the pitch move¬ 
ment in prominent, super-prominent and non-prominent words. As a result of this 
investigation it can be said that most of the prominent sentence accents are marked 
by rising pitch movement, either on the syllable or as a difference between two succes¬ 
sive syllables. In the prominent words the different types of the pitch movements (fall, 
rise, or peak) were counted. Results of the investigation show that most prominent 
words do have a pitch movement (fall, rise or peak), but are generally accompanied 
by some dynamic rising movement. 

For the 32 utterances in the corpus, Table 1 shows a preponderance of rising pitch 
movement in the prominent points of each prosodic schema containing ‘I think...’ 
(45 prominent points out of 107 total for the 32 schemas). The first three columns 
indicate respectively the number of prominent points which exhibit rising fundamental 
frequency on the syllable, falling movement (but preceded by a jump up in 
values from the preceding syllable), or a peak (a rise followed by a fall on the syllable). 
The fourth column indicates no movement on the syllable (but following a jump up 
from the preceding syllable). It is worth noting that in this case, the prominent point 
is almost always accompanied by a jump up, as was the case for the syllables with 
falling movement. The one exception was Utterance No. 1. There the perception of a 
prominent point was due to vowel length nearly double the average length of 
vowels in the same schema. 

Of the prominent points in the utterances, 92.5% involve pitch movement on the 
syllable and 70% involve a jump; by type, 42% are rises, 38% falls (with jumps), 12% 
peaks (rise-fall), and 6% level (with jumps). From this we can conclude that pitch movement 
is important in the perception of prominence, but that in any case ascending movement 
(either of continuous pitch, or discontinuities between two voiced segments) is 
more readily perceptible than descending movement. 
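
The averages and shares quoted here follow from the counts in Table 1, as the short Python check below illustrates. It is a sketch, not part of the original analysis: the variable and function names are ours, and it assumes that the reported total of 107 prominent points is the denominator for every share (the four column counts as printed add up to 106).

```python
# Counts of prominent points by pitch movement, as printed in Table 1.
COUNTS = {"rise": 45, "fall": 41, "rise/fall": 13, "level": 7}
REPORTED_TOTAL = 107  # total printed in Table 1 (assumed denominator)
N_SCHEMAS = 32        # number of prosodic schemas in the corpus


def per_schema(count):
    """Average number of prominent points per schema."""
    return count / N_SCHEMAS


def share(count):
    """Percentage of all reported prominent points."""
    return 100 * count / REPORTED_TOTAL


# Rises, falls and peaks all involve movement on the syllable itself.
on_syllable = COUNTS["rise"] + COUNTS["fall"] + COUNTS["rise/fall"]

for label, count in COUNTS.items():
    print(f"{label:<10}{per_schema(count):6.2f} per schema{share(count):7.1f}%")
print(f"movement on the syllable: {share(on_syllable):.1f}%")
```

Under these assumptions the level share comes out at about 6.5%, which the text gives as 6%.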
Utterance  I  Pitch movement   THINK  Pitch movement  Transition 
1          4  Fall             1      Fall            Jump up 
2          4  Fall             4      Fall            Rising 
3          4  Fall             1      Fall            Jump up 
4          1  Rise             1      Fall            Jump up 
5          4  Level            1      Fall            Jump up 
6          4  Level            1      Fall            Jump up 
7          1  Rise             4      Fall            Falling 
8          1  Jump/fall        4      Fall            Falling 
9          4  Rise             1      Fall            Jump up 
10         4  Rise             1      Rise            Rising 
11         4  Level            1      Rise-fall       Rising 
12         4  Rise             1      Rise            Jump up 
13         1  Jump/rise        4      Fall            Falling 
14         4  Rise             1      Fall            Jump up 
15         4  Fall             1      Level           Jump up 
16         4  Rise             1      Fall            Jump up 
17         1  Rise-fall        4      Fall            Falling 
18         1  Rise-fall        4      Fall            Falling 
19         4  Fall             1      Rise-fall       Rising 
20         4  Fall             1      Rise            Jump up 
21         4  Fall             1      Rise            Jump up 
22         4  Level            1      Fall            Jump up 
23         2  Jump/rise        4      Fall            Falling 
24         4  Level            1      Fall            Jump up 
25         4  Fall             1      Fall            Jump up 
26         1  Rise             4      Fall            Falling 
27         4  Rise             1      Rise-fall       Rising 
28         1  Rise             4      Fall            Rising 
29         1  Jump/fall        4      Fall            Falling 
30         1  Jump/rise-fall   4      Level           Falling 
31         1  Jump/rise        4      Fall            Falling 
32         4  Rise             1      Rise            Rising 

Table 2: Pitch movement on ‘I think’ (Prosodic levels are: 1/Prominent, 2/Super-prominent, 
and 4/Non-prominent). 

For the 32 utterances, Table 2 indicates a preponderance of rising pitch movement 
in the sequence ‘I think’. The column indicating pitch movement for the pronoun ‘I’ 
is marked in gray for all prominent syllables, which reveals a consistent dependence 
on ascending (rise or jump) pitch movement. We see that if a rise is a necessary 
condition for the perception of prominence, it is not a sufficient condition. This is 
particularly true when a small rising movement is continued during the voiced portion 
of the following verb, which ‘overshadows’ it. As for the syllable corresponding 
to the verb ‘think’, the columns indicating pitch movement (and the transition from 
the pronoun) are also marked in gray for all prominent syllables, which again reveals a 
consistent dependence on ascending (rise or jump) pitch movement. 

The marking of prosodic levels on the two initial syllables ‘I’ and ‘think’ corresponds 
to specific iconic contours² of pitch prominence. In general, it can be said that 
there is a relationship between a pitch movement and the perception of a prominent 
accent. A relatively high frequency step and a clearly perceptible pitch movement are 
indications that people will perceive a given word as prominent. Rise, fall and peak 
pitch movements are not equally distributed as markers of prominence. The 
results indicate that most (but not all) of the pitch movements in prominent and 
non-prominent words can be distinguished by the acoustical information of frequency. 

3. discussion. The phonological imprint of the utterer’s subjectivity is directly 
reflected in the choices that one is liable to make in structuring discourse: to position 
oneself in a discussion, identifying with one’s own arguments and opposing them to 
the arguments of one’s ‘co-utterers’, or conversely, by identifying with the arguments 
of others, implicitly co-opting them. We can see this most clearly with the pronomi¬ 
nal referent of the ego (‘I’), which for Benveniste (1971:244), belongs in turn to all 
who participate in discussion; this is of even greater interest when combined with the 
verbal form ‘think’, expressing the relation of the speaking subject to his dictum. 

3.1. examples from the corpus. Four examples of prosodic configurations taken 
from our mini-corpus will illustrate the functioning of ‘I think’ in differing discursive 
strategies. We will first consider the possible nuance of meaning exemplified in 
an utterance where neither ‘I’ nor ‘think’ is marked by a prominent point; second, an 
utterance with a prominent point corresponding to the verb ‘think’; third, an utterance 
with a (super-)prominent point corresponding to the personal pronoun ‘I’; and 
finally an utterance with both words corresponding to prominent points in the prosodic 
schema. The general theme treated by all participants is whether or not there is 
any evidence of wrongdoing on the part of President Clinton or his staff in their treatment 
of the Whitewater affair. (In this and following transcriptions a single slash, /, 
marks a brief pause with no pitch movement, and a double slash, //, marks the end of 
a prosodic schema with falling intonation.) 

(1) MK: The opening statements were more forceful/ There seem, appear to be 
numerous discrepancies on the part of the various players/ so there 
could be something there/ 

CR: Like what ? 

MK: at least of an unethical nature // Well, to begin with, Roger Altman, is, 
is clearly on the hot seat // um, beginning to be on the hot seat/ and 
he’ll be before the Senate committee next week // He testified to mini¬ 
mal contacts between Treasury/ and Resolution Trust Corporation 
people uh/and the White House when he appeared before the Senate 
last February 24th, I believe it was. Since then, we’ve had a whole, series 
of revelations that it was / these contacts were much more extensive/ 
and I think we’re beginning to see a pattern / of why it was important 
/or why it may have been important for, people in the White House to 
try to keep a lid on the RTC investigation. (Utterance No. 2) 

In (1), the absence of prosodic prominence on the initial sequence transmits its 
meaning at face value, with the speaker designating himself simply as the origin of the 
expression which follows, and avoiding any nuance or contrast. Indeed, the economy 
of effort in articulation draws attention to the substance of the message, to the 
detriment of the messenger. This is hardly surprising at the start of the discussion, as 
no other speaker has at this point offered a diverging analysis of the situation. 

(2) CB: But the point is a lot of this wasn’t raised- why wasn’t this, all this 

raised in the presidential campaign ? 

CR: In the campaign, in the coverage of the 1992 campaign. 

MK: Oh, it was. 

CR: There was some criticism, in the fact // I mean The New York Times 
did raise the issue of/ what was going on in Arkansas/ and, and other 
publications, and I can’t speak for Time and Newsweek/ 

JK: Yeah, but in the, in the densest, most... 

CR: And it kind of, but really didn’t... 

JK: ...incomprehensible way; we never got to the, to the $100,000 in this... 

MK: It IS dense// This sort of affair is somewhat... 

JK: ... in the commodities business. I think that/ Carl’s right // Every politi¬ 
cian is hiding something now/ and it is, if you could sum it up in a 
word, it’s their humanity. (Utterance No. 12) 

In (2), the utterer concedes a point to one of his fellows, which he takes the liberty 
to elaborate on; the prominent point on ‘think’ foregrounds the vulnerability of his 
adherence to this argument. We might contrast this with a super-prominent point 
here, which would gloss as ‘I say he may be right, yet I have good reason to doubt it’. 

(3) CR: Let me go to, to John. Your paper has been very very critical/ and 

your editorial page has been enormously critical /of the Clinton White 
House. What do you think so far// 

JF: I think this whole process/ really is designed to /downplay the whole 

scandal/ whatever is THERE / and if you LOOK at the Congressional 
hearings/this is the first Congressional hearing we’ve ever had/ where 
Congressmen actually want to put people to sleep/ and want the cam¬ 
eras to go away// (Utterance No. 7) 
In (3), the participant seeks, by marking the pronoun with an acoustic prominence, to 
contrast his opinion with those of the others who have already spoken. This conscious 
choice establishes his argument as the most accusatory, and possibly the truest. It does 
not, however, prevent others from rallying to it, but on the contrary seeks to convince. 

(4) CR: Joe, what’s, what do you expect to come out of this that might be damaging 
to the Clinton presidency? 

JK: Embarrassing/ people changing their stories, Treasury officials fighting 
with each other /that sort of thing// But, I disagree with Michael just a 
little bit// I, I think that, there really isn’t all that much there, there// uh 
I... I think that there’s not going to be any evidence that an investigation 
was impeded // I think that there is going to be some evidence that 
there were meetings where people from the, the Treasury Department 
told the White House that there was a, an investigation, in progress, but 
you know, it’s kind of ironic. (Utterance No. 4) 

In (4), the prosodic form adopted by the utterer is modulated by a pause (and repetition 
of the personal pronoun) between the two elements, which renders them as two 
successive prominent points. Even without the repetition of the pronoun, a pause (or 
unstressed element) seems necessary in order to realize two successive prominent 
points. This configuration differs from the first in that the express identification both 
of the utterer as the source of the dictum and of the nature of the relationship between 
utterer and thought content is made explicit. Not choosing to highlight this fact for 
the benefit of the co-utterer suggests that these factors are either clear for the latter, 
non-pertinent for the utterer, or both. The nature of this utterance tends to convince 
not only by the prosodic choice made by the utterer, but also by his insistence on the 
lack of evidence (prominent point on ‘any’). 

Although other configurations are theoretically possible for the couplet, our 
corpus only covers these four common cases. The analysis of many examples 
reveals the working of two inter-dependent discursive principles: on the one hand, 
taking the spotlight, which involves an insistence on the primacy of one’s own point of 
view (emphasis on ‘I’) on the level of the enunciative exchange (turn-taking in polite 
circles); on the other, a modalizing of one’s endorsement of one’s own ‘thought content’ 
dictum. A weakening of relative prominence in either case results in a weakened 
position, whether in relation to the opinion of one’s co-utterers or in relation to one’s 
own opinion (to the point of calling it into question). 
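
How the 32 utterances distribute over the four configurations can be tallied from the prosodic levels in Table 2. The Python sketch below is illustrative only: the level strings are our transcription of the ‘I’ and ‘think’ columns, with ambiguous scanned characters read as level 1, and the function and label names are hypothetical.

```python
from collections import Counter

# Prosodic levels for utterances 1-32, transcribed from Table 2
# (1 = prominent, 2 = super-prominent, 4 = non-prominent).
# Assumption: characters that scanned ambiguously are read as level 1.
I_LEVELS = "4441" "4411" "4444" "1444" "1144" "4424" "4141" "1114"
THINK_LEVELS = "1411" "1144" "1111" "4111" "4411" "1141" "1414" "4441"

PROMINENT = {"1", "2"}


def configuration(i_level, think_level):
    """Name the prominence configuration of the couplet 'I think'."""
    i_prom = i_level in PROMINENT
    think_prom = think_level in PROMINENT
    if i_prom and think_prom:
        return "both"
    if i_prom:
        return "I only"
    if think_prom:
        return "think only"
    return "neither"


configs = [configuration(i, t) for i, t in zip(I_LEVELS, THINK_LEVELS)]
tally = Counter(configs)
for name, count in tally.most_common():
    print(f"{name:<12}{count}")
```

On this transcription, Utterances No. 2, 12, 7 and 4, quoted in (1) to (4), fall under ‘neither’, ‘think only’, ‘I only’ and ‘both’ respectively.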

3.2. theory. The Theory of Enunciative Operations propounded by Antoine Culioli 
takes the speech event as a starting point, with the concept of the Speaking Subject or 
utterer and the Moment of Enunciation (Uttering) as the center of all phases in the 
construction of an utterance³. We find three values for locating operations in relation 
to the utterer or Enunciating Subject (S₀): identification with S₀ (1st person pronoun); 
difference in relation to S₀ (for the 2nd person); and non-relatedness in relation to S₀ 
(for the 3rd person). 

Prosody, as an integral part of the acoustic image of the utterance, should be taken 
into account when considering the multiple facets of the construction of meaning 
transmitted to and shared with the co-utterer. Our contribution is therefore to add 
prosodic data input to the parameters of the Subject (S₀), and Time/space coordinates 
(T₀) in an analysis which is already resolutely cognitive, but grounded in individual 
discursive acts. The proposed method here first identifies the acoustic profile of discrete 
levels for the prominent points in the corpus utterances; these levels are then analyzed 
in terms of relations constructed on the basis of the semantic core of the utterance, or 
lexis relations which represent an initial cognitive level of choice (Culioli 1990, 1995). 

In previous work (Schaefer, in press), I have attempted to show that four discrete 
prosodic levels can be shown to operate for personal pronouns in sentence-initial 
position, where vowel reduction in the monosyllabic form is the most common. 
Three of the levels, Prosodic Level 0 (encliticized form with vowel reduction), 
Prosodic Level 1 (full phonetic form plus minor pitch extrusion) and Prosodic 
Level 2 (full form plus major pitch extrusion), are linked to specific discursive modes 
of reference. A fourth ‘minimal’ level (4) covers cases where the full phonetic form is 
maintained, but in the absence of pitch extrusion⁴. These levels are analyzable, in a 
linguistic sense, as abstract ‘paths’, which are the traces of the utterer’s choice within 
a cognitive paradigm (of a definable group of elements, i.e. the co-utterers in the 
Whitewater discussion; see note 1). 

On this view, Level 0 (or 4, if vowel quality is maintained) represents the simple 
selection of the element to which the pronoun refers, with no mention of the paradigm 
(i.e., others) to which it belongs. Level 1 makes a distinction in the choice of an element 
in a paradigm (the choice is schematized as a path among other possible paths) 
without explicitly rejecting the other elements. Level 2 always corresponds to a specific 
choice which explicitly rejects another element or elements in the paradigm, and is 
contingent upon the pre-construction of the paradigm in the context of the situation. 

4. conclusion. With respect to the statements that we considered to be worth inves¬ 
tigating, we now come to the following conclusions. 

Pitch movement and perceived prominent accent are related. Not all of the prominent 
accents in the sentences are produced with a pitch movement, but the large 
majority (92.5%) are. The most prevalent form of movement is a jump up to the perceived 
prominent syllable (70%). In the remainder of cases, it seems that the duration 
of the vocalic portion of the syllable may be responsible for the perception of prominence. 
In such cases, the length of the vowel approaches double the average length of 
all other vowels in prominent syllables of the same schema (uncorrected for inherent 
vowel articulation rates). 

Two distinct levels of pitch movement in the prominent words can be clearly 
distinguished from those in non-prominent words. The pitch movements in the 
super-prominent syllables are effected with a greater difference in frequency than 
the pitch movements in the prominent/non-prominent words. In general we can say 
that the pitch movements in prominent words are more clearly effected than the pitch 
movements in the other words of the sentence. On the other hand, they are in no way 
related to a simple static point corresponding to an accented syllable bearing sentence 
stress, whether it be characterized as high, medium or low (H, M, L). 

The pitch movement in prominent words, occurring in the stressed syllable of 
the prominent words, corresponds to prosodic markers of linguistic operations. 
In the case of ‘I think...’ in initial position, these operations can be traced to 
turn-taking, positioning of one’s argument relative to those of one’s co-utterers, and to the 
evaluation of the importance of one’s arguments relative to the arguments advanced 
by one’s co-utterers. Depending on the positioning of the prominence, the utterer can 
modulate the meaning of his utterance. 

We have tried to show on a very small scale that a careful methodology can pro¬ 
vide provisional evidence for the production of meaning through the manipulation 
(on the part of the utterer) of prosodic markers. It is important to add that the adjunc¬ 
tion of prosodic schema is not simply an add-on to the syntactic or semantic core of 
the utterance, but in certain ways allows the utterer to manipulate the (acoustic) form 
of the utterance in ways which actually modify meaning. 

The prominent points in the utterances and corresponding prosodic levels pertinent 
to the construction of meaning permit the inclusion of acoustic analysis in the larger 
framework of the cognitive operations involved in the utterance act. Prosody, as an 
integral part of the acoustic image of the utterance, must be taken into account when 
considering the multiple facets of the construction of meaning transmitted to and shared 
with the co-utterer. Our contribution is therefore to add prosodic data to the parameters 
of the Enunciative Subject (S₀), and Time/space coordinates (T₀) in an analysis which 
is already resolutely cognitive, but grounded in individual discursive acts. On the basis 
of these parameters of the speech event, problems of the functioning of discourse can 
be addressed more completely. Further evidence of this can only be found in extensive 
analysis of corpora, validating the proposed theory in a constant to-and-fro between 
the enunciative model and the speech data it is derived from. 


¹ According to the theory, the ‘utterer’ or ‘enunciator’ is a linguistic concept, which refers to 
an abstract function in the production of an utterance. It is the central coordinate point 
in the situation of uttering. The utterer is also the origin of all successive choices that 
contribute to the construction of an utterance, and which make it unique. Any choice of an 
element at any point in the utterance is made in relation to a set or paradigm of elements, 
which could replace it at that point. The term co-utterer (co-enunciator) is given to the 
abstract function of the addressee as a necessary parameter in the situation of uttering, 
which is taken into account when the utterer makes these choices. 

² For an account of the iconic motivation of prominent points in the utterance, see 
Rouskov-Low (1993). 

³ Here, the goal of linguistics is ‘to apprehend language through the diversity of natural 
languages’ (Culioli 1990:179). This involves a quest for the invariants which underpin and 
regulate language activity: ‘The goal is not to construct a single grammar, but to reconstruct, 
by a theoretical and formal process, the primitive notions, elementary operations, 
rules and schemata which generate grammatical categories and patterns specific to each 
language.’ This in turn implies evidence which is based on a theory of observable data: the 
analysis of authentic utterances, which helps develop a holistic model to account for all 
facts, including ambiguity, slips of the tongue, deformations, metaphors, etc. 

⁴ This distinction has been discussed at length in Schaefer (1998). 
GEMINATES, NC CLUSTERS, AND 
WORD-MEDIAL CC SEQUENCES IN PONAPEAN 


Chang-Kook Suh 
Cheonan University 


In Ponapean¹, only geminates and place-linked (homorganic) nasal-obstruent clusters 
(hereafter NC clusters) can be found in word-internal syllables². In other words, 
coda consonants are prohibited word-medially, except for geminates and place-linked 
NC clusters. On the other hand, word-final syllables can have single coda consonants 
or double, if they are geminates or homorganic NC clusters (Ito 1989:226). Onsets 
permit only single consonants, so morpheme-initial geminates appear only when 
the morpheme is not word-initial (Rehg 1986, cited in Levin 1989:39). 

In this paper I focus on word-medial CC sequences in Ponapean and make the 
following two points. First, vowel epenthesis in Ponapean satisfies PlOns (PlOns: 
If there are place features, then they must be in onsets [cf. Steriade 1995, Suh 1997]). 
Second, the geminate integrity effect (cf. section 2) is the result of a specific ranking of 
constraints: Max-IO » PlOns » Dep-IO³. I also argue that vowel epenthesis is due 
to illicit coda consonants in Ponapean. That is, vowel epenthesis remedies the 
unacceptable quality of coda consonants, not the unacceptable quantity of consonants. 
According to Rehg (1981) and Rehg and Sohl (1979), word-medial biconsonantal 
clusters are split by the insertion of a vowel. On the other hand, geminates and 
place-linked NC clusters resist vowel insertion. At issue is the integrity effect of geminates 
and NC clusters in this language as well as the character of the coda consonants. 

1. WORD-MEDIAL CC SEQUENCES IN PONAPEAN. Our major concern is to account
for the character of coda consonants and the integrity effect of geminates and NC
clusters in Ponapean, which does not allow illicit coda consonants in word-medial
position. Thus, in this paper we focus on word-medial consonant clusters only⁴. The
following Ponapean examples are taken from Levin (1989), Ito (1989), McCarthy and
Prince (1986), and Rehg and Sohl (1979):


(1) a. arewalla      ‘to return to the wild’
       kemmad        ‘to change into dry clothing’
       urenna        ‘lobster’
       nappa         ‘Chinese cabbage’ (loanword)

    b. nampar        ‘trade wind season’
       nankep        ‘inlet’
       dindil        ‘penetrate’

    c. /ak-dei/      [akedei]      ‘a throwing contest’
       /ak-pʷuŋ/     [akupʷuŋ]     ‘petty’
       /ak-tantat/   [akatantat]   ‘to abhor’
       /kitik-men/   [kitikimen]   ‘rat, indef.’
       /pʷiik-men/   [pʷiikimen]   ‘pig, indef.’

The examples in (1)a have geminates. The examples in (1)b have place-linked, or
homorganic, NC sequences. (1)c shows that potential biconsonantal clusters resulting from
morpheme concatenation are broken up by an epenthetic vowel. In summary, as the data
in (1)c show, vowel epenthesis in Ponapean remedies the unacceptable quality of the
coda consonants, not the unacceptable quantity of the consonants. In the following
section, we will characterize the nature of the coda condition and the integrity effect
in Ponapean.


2. CHARACTERIZING THE CODA CONDITION AND THE INTEGRITY EFFECT. My goal in
this section is to provide a general description of the coda consonants and the geminate
behavior known as integrity, which has been widely discussed in the traditional
rule-based approaches (Kenstowicz & Pyle 1971, Hayes 1986, Schein & Steriade 1986, etc.).

Languages differ according to the closed/no closed syllable parameter (cf. Kaye 
1990). Some languages do not allow codas (e.g. Hawaiian, Desano, Fijian), while 
other languages allow codas, resulting in word-medial consonant clusters (e.g. 
Yawelmani, English, Arabic). In some languages which allow codas, only a restricted 
set of consonants make licit codas (e.g. Axininca Campa, Diola Fogny, Italian,
Japanese, Lardil, Ponapean).

To explain the peculiar aspects of the coda consonants found in many languages,
CodaCond and NoCoda have been proposed in the Optimality Theory (hereafter
OT) literature (Prince & Smolensky 1993, McCarthy & Prince 1993a, Ito & Mester 1994,
among others). Although Ito and Mester (1994) try to characterize the behavior of coda
consonants by combining the CodaCond and NoCoda constraints with the concept
of Alignment from McCarthy and Prince (1993b), their description is unsatisfactory
and cannot be generalized. On the basis of Ponapean examples, I argue that these
constraints can be replaced by concrete constraints like PlOns, CodaSon, and others.

Now, consider the geminate integrity issue. Hayes (1986:321) defines geminate 
integrity as given in (2): 

(2) Geminate Integrity: Insofar as they constitute two segments, long segments
(i.e. geminates) cannot be split by rules of epenthesis.

To account for this special behavior of geminates, attention has focused on
representational properties that distinguish geminates from singletons. Integrity has been
attributed to the unique branching geometry of geminates. In previous approaches (Hayes 1986,
Schein & Steriade 1986), the integrity effect of geminates is explained by a universal
constraint against crossing association lines as shown in (3) (cf. Goldsmith 1976):

(3) * C V C   [diagram: crossing association lines linking the C slots to the nodes labeled Rc]


However, geminate integrity is just a general tendency, and a more accurate
description needs to be made to accommodate such anti-integrity cases as Marshallese (cf.
Goldsmith 1990, Suh 1996). Accordingly, in this paper, the integrity effect is accounted
for in a radically different way by the universal constraint interaction model of OT
(e.g. Prince & Smolensky 1993a; McCarthy & Prince 1993, 1995; Suh 1997), not by the
universal No Crossing Constraint (Goldsmith 1976). That is, integrity is explained in
OT, in which input-output pairs are evaluated by a procedure that checks all possible
outputs for some input against a set of constraints. The constraints are universal and
language variation is explained by the different rankings of constraints.
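
The evaluation procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the constraints C1 and C2, the candidate names, and the violation counts are hypothetical placeholders, and strict domination is modeled as a lexicographic comparison of violation profiles. The sketch shows how one and the same candidate set yields different winners under different rankings.

```python
from typing import Dict, List, Tuple

Violations = Dict[str, int]  # constraint name -> number of violation marks


def evaluate(candidates: Dict[str, Violations], ranking: List[str]) -> str:
    """Return the optimal candidate under strict domination.

    Violation profiles are ordered from the highest-ranked constraint down and
    compared lexicographically, so a single mark on a higher-ranked constraint
    outweighs any number of marks on lower-ranked ones.
    """
    def profile(name: str) -> Tuple[int, ...]:
        marks = candidates[name]
        return tuple(marks.get(constraint, 0) for constraint in ranking)

    return min(candidates, key=profile)


# Hypothetical toy data (not from Ponapean): two constraints, two candidates.
toy = {
    "cand1": {"C1": 1, "C2": 0},
    "cand2": {"C1": 0, "C2": 2},
}

winner_a = evaluate(toy, ["C1", "C2"])  # ranking C1 >> C2
winner_b = evaluate(toy, ["C2", "C1"])  # ranking C2 >> C1
```

Reversing the ranking reverses the outcome, which is the sense in which language variation is attributed to ranking rather than to the constraints themselves.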

3. AN OT ACCOUNT. 

3.1. THE CONSTRAINTS NEEDED FOR PONAPEAN. For an OT account of Ponapean, the
following four key constraints are proposed. They play an important role in the
analysis of Ponapean word-medial examples. First, Max-IO and Dep-IO are core
faithfulness constraints crucial to the whole OT model. In place of the Parse/Fill type of
system presented in McCarthy and Prince (1993a) and Prince and Smolensky (1993),
in which the input is maintained as a literal substructure of the output, the notion of
a correspondence relation between representations plays a key role. Max-IO and
Dep-IO are defined as follows:

(4) Max-IO: Every segment of the input has a correspondent in the output 
(McCarthy & Prince 1995) 

(5) Dep-IO: Every segment of the output has a correspondent in the input 
(McCarthy & Prince 1993a, Prince & Smolensky 1993) 
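
The two definitions can be made concrete by counting marks for a given input-output pair. The following sketch rests on simplifying assumptions that are not part of the analysis: one character stands for one segment, and the correspondence relation is approximated by the string alignment computed by Python's standard `difflib.SequenceMatcher`. A substitution is counted as one deletion plus one insertion, which is a simplification. The function name `faithfulness_marks` is hypothetical.

```python
from difflib import SequenceMatcher
from typing import Tuple


def faithfulness_marks(inp: str, out: str) -> Tuple[int, int]:
    """Return (Max-IO marks, Dep-IO marks) for an input/output pair.

    Assumption: each character is one segment, and correspondence is
    approximated by difflib's alignment of the two strings.
    """
    max_io = 0  # input segments without an output correspondent
    dep_io = 0  # output segments without an input correspondent
    matcher = SequenceMatcher(None, inp, out, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            max_io += i2 - i1
        if tag in ("insert", "replace"):
            dep_io += j2 - j1
    return max_io, dep_io


# Pairs built from /ak-dei/ (morpheme boundary and syllable dots removed).
epenthesis = faithfulness_marks("akdei", "akedei")  # a vowel is inserted
deletion = faithfulness_marks("akdei", "adei")      # the coda /k/ is deleted
faithful = faithfulness_marks("akdei", "akdei")     # nothing changes
```

Under these assumptions, epenthesis incurs a Dep-IO mark only, deletion a Max-IO mark only, and the fully faithful mapping incurs neither.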

For the analysis of coda consonants, we need a constraint on the coda, PlOns:

(6) PlOns: If there are Place features, then they must be in onsets (Suh 1997, 
cf. Steriade 1995) 

The coda node will, in many cases, fail to license place features. The constraint PlOns
has been developed from the discussions of Steriade (1982, 1995), Ito (1988), Ito and
Mester (1994), Heiberg (1993), Scobbie (1992), and Suh (1997), among others. The
main idea of Steriade’s proposal (1995) is that the consonantal point of articulation
features are directly licensed in the onset, indirectly so in the coda:

‘[αF], where F is a consonantal point of articulation feature, must be licensed, in
at least one associated segment, by membership in the onset.’ (Steriade 1995:43).

Thus, the coda consonants in geminates and in NC clusters are assumed to have no
place features in the phonology. Instead, the coda shares the same place features with the
following element (i.e. the onset) in the phonetic component through coarticulation.
As Keating (1988) puts it: ‘coarticulation occurs in part because segments may
lack inherent specification for particular articulations’. If a coda lacks a place
specification, it is coarticulated with the following onset.
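
As a rough illustration, PlOns marks can be counted over a syllabified surface string. The sketch below is a simplification made for exposition, not the formal definition: it treats a coda consonant as satisfying PlOns when it is identical to the following onset (a geminate) or is a nasal with the same place as a following obstruent (a place-linked NC cluster), and it counts every other coda consonant, including a word-final one, as carrying its own place features. The segment classes, the place table, and the function names are assumptions introduced here.

```python
VOWELS = set("aeiou")
NASALS = set("mnŋ")
OBSTRUENTS = set("ptkds")
# Assumed place classes for the consonants used in the examples.
PLACE = {
    "p": "labial", "m": "labial",
    "t": "coronal", "d": "coronal", "n": "coronal",
    "s": "coronal", "l": "coronal", "r": "coronal",
    "k": "dorsal", "ŋ": "dorsal",
}


def coda(syllable: str) -> str:
    """Return the consonants after the last vowel (assumes one vowel at least)."""
    last_vowel = max(i for i, seg in enumerate(syllable) if seg in VOWELS)
    return syllable[last_vowel + 1:]


def plons_marks(syllabified: str) -> int:
    """Count coda consonants that are not place-linked to a following onset."""
    syllables = syllabified.split(".")
    marks = 0
    for idx, syl in enumerate(syllables):
        following = syllables[idx + 1] if idx + 1 < len(syllables) else ""
        onset = following[0] if following and following[0] not in VOWELS else ""
        for c in coda(syl):
            geminate = c == onset
            place_linked_nc = (
                c in NASALS
                and onset in OBSTRUENTS
                and PLACE.get(c) == PLACE.get(onset)
            )
            if not (geminate or place_linked_nc):
                marks += 1
    return marks


marks_nampar = plons_marks("nam.par")    # medial m is linked to p; final r is not
marks_akdei = plons_marks("ak.dei")      # k has its own place features
marks_akedei = plons_marks("a.ke.dei")   # no codas at all
marks_nappa = plons_marks("nap.pa")      # first half of a geminate
```

The counts agree with the PlOns cells of the tableaux in section 3.3 for these candidates.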

Finally, following the proposals for alignment by McCarthy and Prince (1993a, b,
and c), I postulate Align(Wd-R, M-R) to capture the morpheme-final requirement
on the surface at the right edge of a word:

(7) Align(Wd-R, M-R): The right edge of every word coincides with
morpheme-final elements (]Wd = ]M) (cf. McCarthy & Prince 1993a, b, and c,
1994; Ito & Mester 1994)

As expected, this constraint plays an important role in the analysis of word-final 
consonant cases. 

3.2. RANKING OF THE CONSTRAINTS. In Ponapean, vowel epenthesis is due to illicit
coda consonants, not to the consonant clusters in syllable edges⁵. In actuality, the
two consonants in word-internal position belong to two different syllables, and thus
there is no problem of unacceptable consonant clusters in the syllable margins. Here,
PlOns plays a central role in accounting for this phenomenon and is highly ranked
in Ponapean.

PlOns is ranked higher than Dep-IO (PlOns » Dep-IO). This allows vowel insertion
to shift a problematic coda consonant to onset position in a new syllable. Thus,
Dep-IO can be violated to satisfy PlOns. In place-linked NC clusters and geminates, we
do not need to insert a vowel into the cluster in violation of Dep-IO, since
these clusters already satisfy PlOns. I assume Align(Wd-R, M-R) and Max-IO are
ranked higher than PlOns and Dep-IO. Below is a summary of the ranking that is
relevant to the discussion of Ponapean word-medial consonant clusters:

(8) Ranking of constraints for Ponapean: Align(Wd-R, M-R), Max-IO » 
PlOns » Dep-IO 

In the following section, a tableau analysis will be given to show how the integrity
effect of geminates and NC clusters is explained under the OT framework. It will be
shown that the so-called geminate integrity effect falls out from the interaction
of the constraints as a byproduct, without any specific stipulation
or treatment⁶.

3.3. A TABLEAU ANALYSIS. First, we will look at an analysis deriving the (geminate)
integrity effect. Tableau 1 and Tableau 2 show how the integrity effect is produced.

/nampar/           Align(Wd-R, M-R)   Max-IO   PlOns   Dep-IO
☞ a. nam.par                                     *
  b. na.ma.par                                   *       *!
  c. na.mar                             *!       *
  d. nam.pa.ra           *!                              *
  e. nam.pa              *!             *

Tableau 1. /nampar/ ‘trade wind season’


Input: /napa/, with a mora (μ) linked to /p/⁷
[syllable (σ) and mora (μ) trees over each candidate omitted]

                   Max-IO   PlOns   Dep-IO
☞ a. nap.pa
  b. na.pa           *!
  c. na.pa.pa                         *!

Tableau 2. /nappa/ ‘Chinese cabbage’

Recall that the integrity effect permits no vowel insertion into homorganic NC clusters
and geminates. Let us first look at homorganic NC clusters⁸.

In NC clusters like /nampar/ ‘trade wind season’, as shown in Tableau 1, candidate
a is selected as the optimal output. Insertion of an epenthetic vowel into the NC
cluster causes a violation of Dep-IO as well as PlOns (candidate b). Because of this,
b is eliminated. In c, because of the deletion of a word-internal segment, there is
also a crucial Max-IO violation. Changes in word-final position do not help either, as
shown in d and e. That is, addition or deletion of a word-final segment causes a fatal
Align(Wd-R, M-R) violation. Thus, they are all eliminated.
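
The selection of candidate a can be checked mechanically. The sketch below transcribes the violation marks of Tableau 1 and evaluates them against the ranking in (8). One modeling choice is an assumption made here and not stated in the text: since Align(Wd-R, M-R) and Max-IO are not ranked with respect to each other, their marks are pooled within a single stratum before strata are compared.

```python
from typing import Dict, List, Tuple

# Violation marks transcribed from Tableau 1 (constraint -> number of marks).
TABLEAU_1: Dict[str, Dict[str, int]] = {
    "nam.par":   {"PlOns": 1},
    "na.ma.par": {"PlOns": 1, "Dep-IO": 1},
    "na.mar":    {"Max-IO": 1, "PlOns": 1},
    "nam.pa.ra": {"Align(Wd-R, M-R)": 1, "Dep-IO": 1},
    "nam.pa":    {"Align(Wd-R, M-R)": 1, "Max-IO": 1},
}

# Ranking (8). Constraints in the same inner list are mutually unranked.
RANKING: List[List[str]] = [["Align(Wd-R, M-R)", "Max-IO"], ["PlOns"], ["Dep-IO"]]


def profile(marks: Dict[str, int], ranking: List[List[str]]) -> Tuple[int, ...]:
    """Pool marks within each stratum (an assumption), highest stratum first."""
    return tuple(sum(marks.get(c, 0) for c in stratum) for stratum in ranking)


def optimal(tableau: Dict[str, Dict[str, int]], ranking: List[List[str]]) -> str:
    """Return the candidate with the lexicographically smallest profile."""
    return min(tableau, key=lambda cand: profile(tableau[cand], ranking))


winner = optimal(TABLEAU_1, RANKING)
```

With these marks the faithful candidate nam.par wins: it ties with na.ma.par down to PlOns and has no Dep-IO mark, while the remaining candidates are already excluded in the top stratum.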

Now let us consider geminate cases as shown in Tableau 2. In the case of /nappa/
‘Chinese cabbage’, which contains a geminate in the middle of the word, the
completely faithful candidate, a, is optimal, just as in the case of homorganic NC
clusters. Candidate b is eliminated because it has a fatal violation of Max-IO due to the
underparsing of the mora. Candidate c is also out because it violates Dep-IO with
the insertion of an epenthetic vowel into a geminate cluster. In this way, the
geminate integrity effect is produced.

/ak-dei/           Max-IO   PlOns   Dep-IO
  a. ak.dei                   *!
☞ b. a.ke.dei                         *
  c. a.dei           *!

Tableau 3. /ak-dei/ ‘a throwing contest’

Finally, we turn to the case in which vowel epenthesis separates two consonants.
Here, the CC sequence does not constitute a well-formed coda-onset sequence. Thus,
vowel epenthesis is required. This type of epenthesis is different from those seen in
such languages as Palestinian Arabic, Pero, Berber, etc. In these languages,
epenthesis resolves an unacceptable number of consonants in syllable edges. In Ponapean,
however, vowel epenthesis is due not to an unacceptable number of consonants in the
syllables, but to illicit coda consonants.

The word /ak-dei/ ‘a throwing contest’ (Tableau 3) has the prefix ak- ‘to
demonstrate, demonstrating’ and the stem dei ‘far, far along’. The completely faithful
candidate, a, crucially violates PlOns, because the coda consonant [k] has its own place feature
in that position. Rather, the optimal candidate has an epenthetic vowel and no coda
consonant, as shown in b. In b, PlOns is satisfied at the cost of violating the
lowest-ranked constraint, Dep-IO. Candidate c satisfies PlOns, but it crucially violates Max-IO.
Thus, it is eliminated from the competition. Among the candidates, b is selected as the
optimal output form. This shows that Dep-IO can be violated to satisfy
a more highly ranked constraint like PlOns.
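
The placement of the fatal marks (‘*!’) in Tableau 3 can also be derived rather than stipulated. In the sketch below, which is an illustration with hypothetical function names, the winner is found by lexicographic comparison under Max-IO » PlOns » Dep-IO, and the fatal constraint of each loser is taken to be the highest-ranked constraint on which it has more marks than the winner.

```python
from typing import Dict, List, Optional

RANKING: List[str] = ["Max-IO", "PlOns", "Dep-IO"]

# Violation marks transcribed from Tableau 3.
TABLEAU_3: Dict[str, Dict[str, int]] = {
    "ak.dei":   {"PlOns": 1},
    "a.ke.dei": {"Dep-IO": 1},
    "a.dei":    {"Max-IO": 1},
}


def marks_vector(marks: Dict[str, int]) -> List[int]:
    """Order a candidate's marks from the highest-ranked constraint down."""
    return [marks.get(constraint, 0) for constraint in RANKING]


def fatal_constraint(loser: Dict[str, int], winner: Dict[str, int]) -> Optional[str]:
    """Return the highest-ranked constraint on which the loser does worse."""
    for constraint in RANKING:
        loser_marks = loser.get(constraint, 0)
        winner_marks = winner.get(constraint, 0)
        if loser_marks > winner_marks:
            return constraint
        if loser_marks < winner_marks:
            return None  # the candidate would beat the winner here
    return None


winner = min(TABLEAU_3, key=lambda cand: marks_vector(TABLEAU_3[cand]))
fatal = {
    cand: fatal_constraint(TABLEAU_3[cand], TABLEAU_3[winner])
    for cand in TABLEAU_3
    if cand != winner
}
```

The result reproduces the tableau: a.ke.dei wins, ak.dei is excluded by PlOns, and a.dei by Max-IO.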

4. CONCLUSION. It has been shown that medial non-homorganic CC sequences in
Ponapean are split by an epenthetic vowel, which resolves an illicit coda problem by
changing the illicit coda consonant into an onset consonant. This kind of vowel
epenthesis differs in motivation from the common cases of vowel epenthesis,
which resolve an impermissible number of consonants (e.g. CCC).

In addition, the (geminate) integrity effect arises as a byproduct of the
constraints, which are independently motivated in the description of phonology. In
particular, Max-IO, PlOns and Dep-IO play pivotal roles in accounting for that
phenomenon in Ponapean and presumably in other languages, too.

Finally, we might be able to explain coda-related phenomena without using
CodaCond and NoCoda, which are cover terms and lack content. More substantial
constraints like PlOns (and CodaSon, etc.) account for the behavior of the coda
consonants. Moreover, PlOns and CodaSon constraints are independently motivated to
explain phonological phenomena such as coda sonorantization (e.g. Persian, Hausa)
and place assimilation and neutralization processes in many unrelated languages.
This study also sheds light on the typological study of vowel epenthesis caused by
quality and quantity of the consonants in the syllable.


¹ Ponapean is a Micronesian language. It is a member of the Ponapeic subgroup. It
belongs to Western Ponapeic, and Mokilese and Pingelapese belong to Eastern Ponapeic
(Levin 1989:5).

² I am grateful to two anonymous reviewers for their valuable comments and suggestions.
All errors are my own responsibility.

³ See (4), (5) and (6) for their definitions.

⁴ For a comprehensive analysis of Ponapean data within the Optimality Theory framework,
the reader is referred to Suh 1997.

⁵ The nature of the epenthetic vowel is beyond the scope of the present paper. For
discussion see Rehg 1986.

⁶ Geminates do not receive any special treatment in the OT framework, since there is no
concept of rule matching against the input structure having single or double association lines.

⁷ We assume geminates are underlyingly moraic.

⁸ All notational conventions are those of OT.

The conference theme of the 28th LACUS Forum¹ is the nature of linguistic evidence,
and many papers present data from different disciplines that illustrate, directly or
indirectly, what Lamb calls ‘the transparency illusion’ (1998:12), that is, the failure to
discover and observe reliable data that reflect the functioning of our minds. According to
Lamb, this failure appears to be partially due to the fact that our minds and their connection
to the real world are infinitely complicated and elusive (see Lamb 1998:12, 161), and,
consequently, much of the evidence available to linguists is still restricted to the province of
‘analytical linguistics’ or the analysis of texts (ibid.: 6, 9). Lamb’s solution to this illusion
is to propose a relational network, indicating that the scrutiny of our mental system,
which is claimed to underlie our language use, is achieved by means of our associative
abilities. It goes without saying that the meanings of lexemes in particular are in
one way or another closely connected, especially when we recognize the existence of
a great number of polysemous lexemes, and this line of thinking is also shared by the
majority of cognitive linguists (see for instance Langacker 1991:3). Bearing this in mind,
we would like to examine Lamb’s basic inquiry in the context of the conference’s main
theme, the nature of linguistic evidence, from cognitive-communicative perspectives.
The focus on these two aspects of language is essentially not a novel idea. In Stein and
Wright (1995) and Tomasello (1998), for instance, the role of the speaker as a locutionary
force is ultimately the main concern in the analysis of linguistic data. Although we do
not consider subtle differences in the theoretical perspectives presented in these books,
this paper shares the same assumption that the understanding of a linguistic structure
makes use of cognitive and communicative operations under the control of the speaker.
Thus, the main objective of this paper is to reveal exactly how these operations are
integrated in the formation of the Icelandic perfect.

The paper is organized as follows. In section 1 previous studies on the English
perfect based on Bybee and Dahl (1989), Bybee et al. (1994), and Carey (1996) are briefly
summarized. Section 2 presents the facts of the Icelandic perfect. We first show
that the Icelandic perfect is used to express the way the speaker construes an event
from the perspective of remoteness, and, second, by examining some historical texts,
we demonstrate that occurrences of the perfect serve as evidence for the emergence
of the speaker’s communicative strategies. In this light, the data from the Icelandic
perfect signal that the frequently cited semantic categories ‘anterior’ and ‘resultative’,
assigned to the perfect, might be regarded as general characterizations of the perfect
but fail to pinpoint fine-grained functions operating within the perfect. Finally, we
suggest, though tentatively, that the grammaticalization of the auxiliaries hafa and
vera arises from two interconnecting strategies, one being cognitive and the other
communicative, rather than from a principle of unidirectionality (Bybee et al. 1994:
12-15), which relies heavily on the distinction between anterior and resultative.

1. brief summary of the English perfect. The perfect is often formed by two
auxiliary verbs, have and be, cross-linguistically. As often stated in the literature, these two
perfect forms have distinct but closely related semantics. Generally, the have-perfect
has the semantics of anterior such that the past event has some relevance at the time
of utterance, while the be-perfect expresses the resultant state which arises directly
from the past action at the point of utterance. Both forms are conceived of as
general expressions of current relevance (see Dahl 1985:133-35), the only difference being
that the current relevance interpretation in the former can be correlated to discourse
components, while the latter is always lexically determined. Thus, the crucial
difference between John has gone and John is gone is that the former can integrate a
peripheral meaning. For instance, the semantics of the former can evoke a situation
in which we cannot go to the party because we are now three, and entry into the hall is
only allowed for groups of four. By contrast, the semantics of the latter is
restricted to the resultant state that is derived directly from the lexical meaning of go,
that is, John’s absence from the location; there are no peripheral or discourse-oriented
interpretations. A frequently cited diagnostic test to distinguish these two forms
is the behavior of still. The stative nature of the resultative
construction allows the stative semantics imparted by still. Thus, John is still gone indicates
John’s continued absence from the location, while John has still gone is assigned the
dynamic semantics that no matter what you told him, John has gone anyway. Carey
(1996) in this context remarks that the adverbial since also contributes to the
distinction. Since is compatible with have (I have seen her since Friday; Carey 1996:33)
because it refers to the up-to-the-present time period, collocating with the semantics
of the anterior. The resultative He is gone since Friday (ibid.) is claimed to be
ungrammatical², because the be-perfect characteristically concerns the final state of the event,
which is incompatible with the semantics of since. Whatever the shortcomings of these
tests, the have-perfect emphasizes the process of a past action, whereas the be-perfect
refers to the state which is the outcome of the past action.

Another often-cited characteristic of resultative and anterior is the claim that
resultative has developed into anterior unidirectionally (Bybee et al. 1994:68). More
precisely, after the loss of the agreement between subject/object and a past participle,
a dominant form with have gradually replaced instances of the be-perfect. Bybee
and Dahl (1989:69-70), for instance, argue that a shift from resultative to anterior
is also influenced by lexical restrictions of the main verb, such that resultatives are
only formed from telic verbs³. Aside from these observations, the hypothesis of
unidirectionality has been strengthened by studies of some languages in which the
inferential interpretation developed out of the perfect with the loss of its existing
anterior meaning (Bybee et al. 1994:73). In this context, Dahl (1985:153) and Bybee et
al. (1994:96) mention a close developmental connection between result and inference
as a triggering factor of the latter, in that both functions correlate with the result of a
past action; in the former the result exists due to a past action, while in the latter the
inference is drawn on the basis of the result of a past action, and this link is claimed
to give rise to the inferential use of the perfect.

2. evidence from the Icelandic perfect. Like English and many other languages,
Icelandic uses both hafa ‘have’ and vera ‘be’ to form the perfect. Despite this fact, the
perfect in Icelandic diverges in many ways from the English perfect, although they
share, at first glance, basic similarities: the be-perfect expresses the result of a past
action (= resultative), while the have-perfect concerns a time period prior to the
point of utterance (= anterior). Upon close investigation, however, we see that an
understanding of the Icelandic perfect extends beyond this two-way distinction. We
highlight the following two points. First, the crucial difference between be- and
have-perfects in Icelandic lies in the degree of remoteness. Second, based on our
investigation of selected historical texts, we argue that the rise of the have-perfect with
different semantics interacts with the speaker’s communicative strategies⁴.

2.1. remoteness. We mentioned in section 1 that the crucial difference between
anterior and resultative is the different emphasis on the past action in a sentence.
From the since-diagnostic, for instance, it seems that anterior focuses more on the
process of an action, whereas the focus of resultatives is on the existence of a newly
arisen state. Bybee et al. (1994:69) describe this difference in the following manner:
‘A resultative... expresses the rather complex meaning that a present state exists as
the result of a previous action. An anterior, in contrast, expresses the sense that a past
action is relevant in a much more general way to the present moment’. One problem
we encounter with this claim is that the behavior of the Icelandic perfect does not
coincide with diagnostic tests offered by explanations for English. The most striking
evidence is that both hafa and vera in Icelandic can allow the adverbial enn ‘still’ and
both impart the same dynamic, i.e., repetitive sense, as shown in (1):

(1) a. Hann er enn farinn.
       he is still gone
       ‘He (is) still gone again’

    b. Hann hefur enn farið.
       he has still gone
       ‘He (has) still gone again’

As illustrated in (2), both hafa and vera can also permit síðan ‘since, after’, referring to
the time-span between the past and present:

(2) a. Hann er kominn síðan í gær.
       he is come since yesterday
       ‘He (is) come since yesterday’

    b. Hann hefur komið síðan í gær.
       he has come since yesterday
       ‘He (has) come since yesterday’

As indicated in (3), vera is in fact compatible with dynamic adverbial expressions
such as í flýti ‘in a hurry’, which contradicts the interpretation of the be-perfect as an
exclusively resultant state:

(3) a. Hann er farinn í flýti.
       he is gone in.a.hurry
       ‘He (is) gone in a hurry’

    b. Hann hefur farið í flýti.
       he has gone in.a.hurry
       ‘He (has) gone in a hurry’

These examples signal that the definition of resultative and anterior, as given by Bybee
et al. (1994:54-55) and others, does not hold neatly for Icelandic. Our explanation for
this behavior of the Icelandic perfect is that vera should be understood as referring to
the time span from the past to the present in much the same way as hafa; they are two
variants of the expression of the perfect event. One piece of evidence for this claim is
that vera can co-occur with a past adverbial like í gær ‘yesterday’. In Icelandic, passives
are formed by a past participle and the auxiliary be, which is formally identical to the
perfect auxiliary: both have the form er in (4). Interestingly, however, as indicated in
(4)c, passive constructions⁵ do not allow í gær, and the adjectival construction (4)d
behaves in the same manner. The reason for this discrepancy might be that these two
constructions do not concern the time period from the past to the present but only
express the present state. In other words, despite the fact that these three
constructions formally resemble each other, the perfect differs significantly from passives and
adjectivals from a cognitive point of view.

(4) a. Veggurinn er brotnaður í gær. (be-perfect)
       wall.the is broken yesterday
       ‘The wall (is) broken yesterday’

    b. Veggurinn hefur brotnað í gær. (have-perfect)
       wall.the has broken yesterday
       ‘The wall (has) broken yesterday’

    c. *Veggurinn er brotinn í gær. (passive)
       wall.the is broken yesterday
       *‘The wall is broken yesterday’

    d. *Veggurinn er blár í gær. (adjective)⁶
       wall.the is blue yesterday
       *‘The wall is blue yesterday’

As shown in (5), however, a time adverbial such as rétt áðan ‘just now’, which refers
to the recent past, is only felicitous with vera, not with hafa. Interestingly, as
exemplified by (6) and (7), vera is not possible when the sentence expresses the remote or
distant past; dinosaurs lived millions of years ago and have no relevance to the current
situation, and, likewise, the expression five years ago is more distant than
yesterday (see examples (4)c and (4)d).

Viga-Glums saga   Present   Preterite   Perfect   Perfect   Pluperfect
(Gluma)                                 (hafa)    (vera)
total 3232        1344      1709        75        7         97
percentage 100%   41.6%     52.9%       2.3%      0.2%      3%

Table 1. Tokens and percentages of different tense categories in Gluma.


functions     Hafa-perfect   Current relevance   Extended-now   Inferential   Experiential
total         75             3                   15             22            35
percentage    100%           4%                  20%            29%           47%

Table 2. Tokens and percentages of different functions of the perfect in Gluma.


(5) a. Hann er farinn rétt áðan.
       he is gone just now
       ‘He (is) gone just now’

    b. ?Hann hefur farið rétt áðan.
       he has gone just now
       ‘He (has) gone just now’

(6) a. *Risaeðlur eru hlaupnar hér.
       dinosaurs are run here
       ‘Dinosaurs (are) run here’

    b. Risaeðlur hafa hlaupið hér.
       dinosaurs have run here
       ‘Dinosaurs (have) run here’

(7) a. ?Hann er farinn fyrir fimm árum.
       he is gone five years ago
       ‘He (is) gone five years ago’

    b. Hann hefur farið fyrir fimm árum.
       he has gone five years ago
       ‘He (has) gone five years ago’


Examples (5) to (7) support our claim that the difference between the be- and
have-perfects in Icelandic has something to do with our cognitive encoding of an event or,
more precisely, our recognition of the degree of remoteness; the speaker profiles or puts
emphasis on an event which is closer to him in the be-perfect, whereas in the
have-perfect the speaker profiles an event which is remote from him.


2.2. communicative strategies. When we look at an Old Icelandic saga text such
as Viga-Glums saga (Gluma), written around 1330, as shown in Table 1, it is
noteworthy that the preterite is the most frequently occurring verb form (52.9%), while the
frequency of the perfect is much lower (2.3%). Table 2 shows the percentage of each
function assigned to the perfect (total 75) in Gluma⁷.
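
The percentages in Table 1 follow directly from the token counts, and a short computation makes the check reproducible. The sketch below uses only the counts reported in Table 1; the dictionary and function names are hypothetical, and rounding to one decimal place is assumed to match the table.

```python
from typing import Dict

# Token counts from Table 1 (Gluma).
GLUMA_TENSES: Dict[str, int] = {
    "present": 1344,
    "preterite": 1709,
    "perfect (hafa)": 75,
    "perfect (vera)": 7,
    "pluperfect": 97,
}


def shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each category's share of all tokens, in percent (one decimal)."""
    total = sum(counts.values())
    return {category: round(100 * n / total, 1) for category, n in counts.items()}


gluma_total = sum(GLUMA_TENSES.values())
gluma_shares = shares(GLUMA_TENSES)
```

The counts sum to the reported total of 3232, and the preterite and hafa-perfect shares come out at 52.9% and 2.3%, as cited in the text.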

It is worth mentioning that many preterite forms in Old Icelandic can be replaced
by the perfect in Modern Icelandic; in other words, the preterite still retained the
form whose function in a given situation would correspond to that of the present-day
perfect. The text in (8) contains hefr komit ‘have come’ and drap ‘killed’; the perfect
form for the latter would be the most preferred form in Modern Icelandic given the
presence of the time adverbial í dag ‘today’.

Newspaper articles   Present   Preterite   Perfect   Perfect   Pluperfect
                                           (hafa)    (vera)
total 4154           1823      1815        279       27        210
percentage 100%      43.9%     43.7%       6.7%      0.6%      5.1%

Table 3. Tokens and percentages of different tense categories in Morgunblaðið
(2000-01).

(8) Hann svarar: “Þat er satt - eigi hefr mér í hug komit
    he answers it is true - not have me.DAT in mind.ACC come
    at segja”, kvað Glúmr, “at ek drap Sigmund Þorkelsson í dag”.
    to say said Glumr.NOM that I killed Sigmundr.ACC Þorkelsson today
    ‘He answers: “It is true that it has not come to my mind”, Glumr said, “that I
    killed Sigmundr Þorkelsson today”’ (Gluma, ch. 8, p. 25)

Co-existence of the preterite and perfect in (8) suggests that preterite forms were
probably gradually replaced with perfect forms. In fact, we can often observe
fluctuations between these two forms in Gluma, as shown in example (9), where í dag is
accompanied by the perfect and preterite in direct speech, both expressing the same
semantic content of completed action with relevance at the point of utterance.

(9) Síðan mælti Glúmr við Guðbrand: “Þú hefr mikillar
    since said Glumr.NOM to Guðbrandr.ACC you have great
    frægðar aflat þér í dag, er þú lagðer at jörðu
    celebrity.GEN gained you.DAT today when you put to earth.DAT
    Þorvald krók ok mikit lið veittir þú oss í dag”.
    Þorvaldr.ACC hook.ACC and much support.ACC gave you us today
    ‘After that, Glumr said to Guðbrandr: “You have gained great celebrity
    today, when you killed Þorvaldr the hook (literally: when you placed
    Þorvaldr the hook to earth) and you gave us great support today”’ (Gluma, ch.
    23, p. 6)

As shown in Table 3, extracted from selected current newspaper articles in Modern
Icelandic⁸, although the preterite is still used dominantly (the difference from the
present is only 0.2%), the percentage of use of the perfect has clearly increased. This
might point to the stabilization of the perfect form and its semantic functions in
Modern Icelandic. Table 4 indicates that functions associated with the perfect have
changed in the course of time and in Modern Icelandic inferential and experiential
are no longer the dominant uses, whereas extended-now and current relevance are
more frequently used⁹.

functions     Hafa-perfect   Current relevance   Extended-now   Inferential   Experiential
total         279            134                 97             22            26
percentage    100%           48%                 34.8%          7.9%          9.3%

Table 4. Tokens and percentages of different functions of the perfect in Morgunblaðið
(2000-01).
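
The shift in the functions of the hafa-perfect between the two corpora can be expressed as a change in percentage points. The sketch below is a simple descriptive comparison using only the token counts of Tables 2 and 4; it is not a significance test, and the names used are hypothetical.

```python
from typing import Dict

# Token counts of hafa-perfect functions from Table 2 (Gluma) and Table 4
# (Modern Icelandic newspaper articles).
GLUMA: Dict[str, int] = {
    "current relevance": 3,
    "extended-now": 15,
    "inferential": 22,
    "experiential": 35,
}
MODERN: Dict[str, int] = {
    "current relevance": 134,
    "extended-now": 97,
    "inferential": 22,
    "experiential": 26,
}


def percentages(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each function's unrounded share of all perfect tokens, in percent."""
    total = sum(counts.values())
    return {function: 100 * n / total for function, n in counts.items()}


def shift(old: Dict[str, int], new: Dict[str, int]) -> Dict[str, float]:
    """Return the change in share (percentage points), rounded to one decimal."""
    old_pct, new_pct = percentages(old), percentages(new)
    return {function: round(new_pct[function] - old_pct[function], 1) for function in old}


changes = shift(GLUMA, MODERN)
```

On these counts the shares of the inferential and experiential functions fall, while those of current relevance and extended-now rise, in line with the description of Table 4.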

With this background, we argue that the gradual change from the preterite to the 
perfect and, hence, their simultaneous co-occurrence in scenes in Gluma is to be 
interpreted as a sign of a shift in the communicative strategies the speaker was using 
to adjust his expressive attitude towards the speech event he was involved in. In what 
follows, two explanations are presented. 

2.2.1. expressive attitudes. In this subsection, we concentrate on the inferential function,
demonstrating why it was preferred in old texts, while it is no longer preferred in Modern
Icelandic. We assume that the inferential in old texts has a wider range of semantics,
including the functions of present-day modal adverbials such as surely or certainly. For
instance, as seen by the contrast between (10) and (11), the speaker’s expressive power
is made clearer when adverbs such as vissulega or örugglega, both meaning ‘surely’, are
added to the Modern Icelandic sentence (11); they emphasize the force of Glumr’s
conviction about his inference from what people reported to him (that they could not
recognize the target, Skuta, and were deceived by his false name and disguise). In (10),
by contrast, the perfect alone is used to convey the same situation.

(10) “Nú hefir orðit ráðfátt”, segir Glúmr, “þar hafe þér 

now have become at loss said Glumur.NOM there have you 

Skútu fundit... 

Skuta.ACC found 

‘“Now you have been helpless”, said Glumr, “There you have found 
Skuta...”’ (Gluma, ch. 16, p. 47) 

(11) “Nú hefur vissulega orðið ráðafátt”, segir Glúmr, “þar hafið 

now have surely become at loss said Glumur.NOM there have 

þið örugglega fundið Skútu... 

you surely found Skuta.ACC (Modern version of (10)) 

Modal adverbials are found, though not frequently, with the perfect in newspaper 
articles in Modern Icelandic. They serve to strengthen the speaker’s expressive power. 
In (12), for instance, the presence of örugglega ‘surely’ reinforces what the children 
judge from the weather in Iceland, in addition to the more basic evidential function 
of the perfect based on a prima facie result (= less rain in Iceland). 

(12) Þó sögðu krakkarnir að veðrið hér hafi 
nonetheless said children.the.NOM that weather.the.NOM here have.SUBJ 

örugglega verið betra en heima hjá þeim þar sem miklar 
surely been better than home with them because great 

rigningar hafi verið í Frakklandi að undanförnu. 
rain.NOM.PL have.SUBJ been in France.DAT recently 

‘Nonetheless the children said that the weather here has certainly been 
better than in their home country because there has been much rain 
recently in France’ (Morgunblaðið, 21 July 2001: p. 4) 

From these differences, we suggest that the perfect in Old Icelandic was itself used to 
transmit the speaker’s existing communicative needs. 

2.2.2. frequent occurrence in direct speech. Another striking characteristic of the 
use of the perfect in Old Icelandic is that it frequently appeared in direct speech. All 
perfect forms in Gluma save one, for instance, appear in direct speech; Egils saga, 
written around 1200, contains 29 instances of the perfect, 27 of which appear in direct 
speech (cf. Nordal 1993:50-59). Interestingly, in modern texts perfect forms appear 
regardless of text type; there is no tendency for the perfect to be linked specifically to 
direct speech (see Yamaguchi & Petursson in preparation). Researchers agree that direct 
speech, particularly when integrated into narrative texts, reinforces the interpersonal 
involvement of the speaker, and that the use of the first person makes the narration more 
vivid (Tannen 1986:312). From this perspective, the dominant occurrence of the 
perfect in direct speech supports our hypothesis that the new functions of the perfect 
related to the speaker’s expressive power emerged first in direct speech, because direct 
speech fulfills that function more successfully. 

3. conclusion. This paper demonstrates that cognitive and communicative mechanisms 
have a bearing on the understanding of a language structure such as the Icelandic 
perfect. Our investigation suggests a new picture of the Icelandic perfect from the 
perspectives of the degree of remoteness, on the one hand, and the speaker’s pragmatic 
involvement in a speech event, on the other. Although the results reported in this 
paper are tentative findings from our ongoing research, we believe that they shed 
light on the conference theme, the nature of linguistic evidence. The findings point us in 
the direction of recognizing the nature of mind and support the cognitive and 
communicative aspects of a linguistic system that are part of Lamb’s neuro-cognitive approach. 
Hence, as a final point, the study of the Icelandic perfect, particularly its historical 
aspect, implies that the emergence of various functions such as resultative, anterior, 
and inferential might not be brought about exclusively as the result of a principle of 
unidirectionality. This further casts doubt on the suggestion of Hopper 
and Traugott (1993:66-67) that the process of grammaticalization essentially excludes 
the component of communication. 


We are grateful to anonymous reviewers for their useful comments. 

It is not ungrammatical, though its appropriateness certainly depends on context. We are 
grateful to Arle Lommel for pointing this out to us. 

See Yamaguchi and Petursson (to appear) for counterarguments to this claim. 

Note that we mainly deal with the perfect with hafa in this paper. For an elaborate study 
of the vera-perfect in Icelandic, see Yamaguchi and Petursson (to appear). 

We do not make a distinction between verbal and adjectival passive constructions, since 
neither form allows a time adverbial such as í gær ‘yesterday’. Thus, í dag ‘today’ is 
compatible with the passive: Veggurinn er brotinn í dag af Jóni ‘The wall is broken by 
John today’. 

Note that these sentences become grammatical, as we envisage, when the past tense form 
is used for the finite verb: Veggurinn var brotinn í gær ‘The wall was broken yesterday’ and 
Veggurinn var blár í gær ‘The wall was blue yesterday’. 

All numbers are tokens. 

We have investigated 32 articles in a daily newspaper, Morgunblaðið, from July 2000 to July 
2001. 

Note, however, that many tokens for current relevance are replaceable with the preterite 
forms, while the other three functions are not easily replaceable. 